Given n observations x1, ..., xn, the generalized mean (also called power mean) is defined as

Mp = ((x1^p + ... + xn^p) / n)^(1/p).
The case p = 1 corresponds to the traditional arithmetic mean, the limit as p tends to 0 yields the geometric mean, and p = -1 yields the harmonic mean. This mean is favored by statisticians. It is a particular case of the quasi-arithmetic mean.
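The three special cases above can be checked with a few lines of Python. This is only a minimal sketch, assuming the usual definition of the power mean (the p-th root of the average of the p-th powers) and positive observations. The function name `power_mean` and the sample values are made up for illustration.

```python
import math

def power_mean(xs, p):
    """Power mean of positive observations xs.

    p = 0 is treated as the limiting case, which is the geometric mean.
    """
    n = len(xs)
    if p == 0:
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x ** p for x in xs) / n) ** (1.0 / p)

xs = [1.0, 2.0, 4.0]              # made-up sample
arithmetic = power_mean(xs, 1)    # (1 + 2 + 4) / 3
geometric = power_mean(xs, 0)     # (1 * 2 * 4) ** (1/3) = 2
harmonic = power_mean(xs, -1)     # 3 / (1 + 1/2 + 1/4) = 12/7
near_zero = power_mean(xs, 1e-6)  # should be close to the geometric mean

print(arithmetic, geometric, harmonic, near_zero)
```

Evaluating the general formula at a tiny p, as in `near_zero`, is a quick way to see that the geometric mean is a limit rather than a direct substitution of p = 0.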
Here I introduce another kind of mean, called the exponential mean, also based on a parameter p, that may appeal to data scientists and machine learning professionals. It is also a special case of the quasi-arithmetic mean. Though the concept is basic, there is very little, if any, literature about it. It is related to LogSumExp and the log semiring. It is defined as follows:

mp = log_p((p^x1 + ... + p^xn) / n).
Here the logarithm is in base p, with p positive and different from 1. As p tends to 0, mp tends to the minimum of the observations. As p tends to 1, it yields the classic arithmetic mean, and as p tends to infinity, it yields the maximum of the observations.
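The limiting behavior can be explored numerically. The sketch below assumes the exponential mean is the base-p logarithm of the average of the values p^xi, which is my reading of the definition above. The function name `exponential_mean` and the sample are hypothetical. The shift by the largest (or smallest) observation is the standard log-sum-exp trick, added here only to keep the computation stable for extreme values of p.

```python
import math

def exponential_mean(xs, p):
    """Exponential mean: log base p of the average of p ** x, for p > 0.

    p = 1 is treated as the limiting case, which is the arithmetic mean.
    """
    n = len(xs)
    if p == 1:
        return sum(xs) / n
    t = math.log(p)
    # Factor out the dominant term so every exponent below is <= 0.
    pivot = max(xs) if t > 0 else min(xs)
    s = sum(math.exp(t * (x - pivot)) for x in xs)
    return pivot + math.log(s / n) / t

xs = [0.07, 0.3, 0.89]                   # made-up sample
near_min = exponential_mean(xs, 1e-300)  # p very close to 0
near_avg = exponential_mean(xs, 1.000001)
near_max = exponential_mean(xs, 1e300)   # p very large

print(near_min, near_avg, near_max)
```

Even with p as extreme as 1e-300 or 1e300, the result is still slightly away from the minimum or maximum, since the gap shrinks only like 1 / |ln p|.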
Advantages of the exponential mean
One advantage of the exponential mean is that it always exists, even if the observations take on negative values. This is not the case in general for the power mean if p is not an integer. Also, the exponential mean is stable under translation, unlike the power mean. That is, if you add a constant a to each observation, then the exponential mean is shifted by the same quantity a. In short, mp(x1 + a, ..., xn + a) = a + mp(x1, ..., xn). Conversely, the power mean is stable under multiplication by a positive constant, while the exponential mean is not.
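Both stability properties are easy to verify on a small example. The following sketch assumes the definitions used in this article (power mean as the p-th root of the average of p-th powers, exponential mean as the base-p logarithm of the average of p^xi). The function names, the sample, and the constants a and c are made up for illustration.

```python
import math

def power_mean(xs, p):
    # Power mean for positive data and p != 0.
    return (sum(x ** p for x in xs) / len(xs)) ** (1.0 / p)

def exponential_mean(xs, p):
    # Log base p of the average of p ** x, for p > 0 and p != 1.
    return math.log(sum(p ** x for x in xs) / len(xs)) / math.log(p)

xs = [0.2, 0.5, 0.9]      # made-up sample
p, a, c = 2.0, 3.0, 5.0   # parameter, shift, and scale (all arbitrary)

shifted = [x + a for x in xs]
scaled = [c * x for x in xs]

# Translation: the exponential mean shifts by exactly a, the power mean does not.
exp_shift_gap = exponential_mean(shifted, p) - (a + exponential_mean(xs, p))
pow_shift_gap = power_mean(shifted, p) - (a + power_mean(xs, p))

# Scaling: the power mean scales by exactly c, the exponential mean does not.
pow_scale_gap = power_mean(scaled, p) - c * power_mean(xs, p)
exp_scale_gap = exponential_mean(scaled, p) - c * exponential_mean(xs, p)

print(exp_shift_gap, pow_shift_gap, pow_scale_gap, exp_scale_gap)
```

The first and third gaps should be zero up to rounding error, while the second and fourth should be clearly nonzero.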
Finally, the central limit theorem applies to both the power and exponential means as the number n of observations grows.
Illustration on a test data set
I tested both means (the exponential mean mp and the power mean Mp) for various values of p ranging between 0 and 2. See the chart above, where the X-axis represents the parameter p, and the Y-axis represents the mean. The test data set consists of 10 numbers randomly chosen between 0 and 1, with an average value of 0.53. Note that if p = 1, then mp = Mp = 0.53 is the standard arithmetic mean.
The blue curve in the above chart is very well approximated by a logarithm function, except when p is very close to zero or extremely large. The red curve is well approximated by a second-degree polynomial. Convergence to the maximum of the observations (equal to 0.89 here), as p tends to infinity, occurs much faster with the power mean than with the exponential mean. Note that min(x1, ..., xn) = 0.07 in this example, and the exponential mean starts approaching that value only when p is extremely close to zero.
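The difference in convergence speed can be reproduced on any sample. The sketch below does not use the original test data, which is not listed here. Instead, it uses 10 made-up numbers that happen to share the same minimum (0.07) and maximum (0.89). The function names are hypothetical, and the definitions are the ones assumed throughout this article.

```python
import math

def power_mean(xs, p):
    # Power mean for positive data and p != 0.
    return (sum(x ** p for x in xs) / len(xs)) ** (1.0 / p)

def exponential_mean(xs, p):
    # Log base p of the average of p ** x, for p > 0 and p != 1.
    return math.log(sum(p ** x for x in xs) / len(xs)) / math.log(p)

# Made-up sample, NOT the data behind the chart.
xs = [0.07, 0.25, 0.41, 0.48, 0.55, 0.62, 0.70, 0.77, 0.84, 0.89]
largest = max(xs)

ps = [2.0, 10.0, 100.0]
gap_power = [largest - power_mean(xs, p) for p in ps]
gap_exp = [largest - exponential_mean(xs, p) for p in ps]

for p, gp, ge in zip(ps, gap_power, gap_exp):
    print(f"p = {p:6.1f}  max - Mp = {gp:.4f}  max - mp = {ge:.4f}")
```

On this sample, the power mean is closer to the maximum than the exponential mean at each of the three values of p, and the difference widens as p grows.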
Important inequality
This inequality, valid for the power mean Mp and resulting from the convexity of some underlying function, also applies to the exponential mean mp:
- If p < q, then mp ≤ mq, and mp = mq if and only if x1 = ... = xn.
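The inequality can be checked numerically before attempting a proof. The sketch below evaluates the exponential mean on an increasing grid of values of p, again assuming the base-p logarithm definition used in this article. The sample, the grid, and the function name are made up, and a numerical check like this is of course not a proof.

```python
import math

def exponential_mean(xs, p):
    # Log base p of the average of p ** x, for p > 0 and p != 1.
    return math.log(sum(p ** x for x in xs) / len(xs)) / math.log(p)

grid = [0.01, 0.1, 0.5, 2.0, 10.0, 100.0]   # increasing values of p, skipping p = 1

xs = [0.07, 0.31, 0.58, 0.89]               # made-up sample with distinct values
values = [exponential_mean(xs, p) for p in grid]
is_increasing = all(u < v for u, v in zip(values, values[1:]))

same = [0.4] * 5                            # all observations equal
constant_values = [exponential_mean(same, p) for p in grid]

print(values, is_increasing)
print(constant_values)
```

With distinct observations the values increase strictly with p, and when all observations are equal the mean stays at that common value for every p, matching the equality case.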
Proving this inequality is equivalent (see here) to proving that
The derivative of mp with respect to p is well approximated by a power function and is positive everywhere, which suggests that the inequality indeed holds. Let us denote this derivative as m'p. We have (see here):
It is interesting to note that m1 is the arithmetic mean and m'1 is half the empirical variance of x1, ..., xn. It would be interesting to see how higher-order derivatives of mp, evaluated at p = 1, relate to higher empirical moments of x1, ..., xn.
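The relationship between m'1 and the variance can be checked with a finite difference. One way to see why it holds: writing t = ln p, the exponential mean equals K(t) / t, where K is the logarithm of the empirical moment generating function, and K(t) = (mean) t + (variance) t^2 / 2 + higher-order terms. The sketch below assumes "empirical variance" means the variance with divisor n (not n - 1), which is what this expansion produces. The sample, the step size h, and the function name are made up.

```python
import math
import statistics

def exponential_mean(xs, p):
    # Log base p of the average of p ** x, for p > 0 and p != 1.
    return math.log(sum(p ** x for x in xs) / len(xs)) / math.log(p)

xs = [0.07, 0.25, 0.41, 0.48, 0.55, 0.62, 0.70, 0.77, 0.84, 0.89]  # made-up sample

h = 1e-4  # step for the central difference around p = 1
slope_at_1 = (exponential_mean(xs, 1 + h) - exponential_mean(xs, 1 - h)) / (2 * h)

# Variance with divisor n (population variance), halved.
half_variance = statistics.pvariance(xs) / 2

print(slope_at_1, half_variance)
```

The two printed numbers should agree to several decimal places. Using `statistics.variance` (divisor n - 1) instead would give a slightly different value.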
Doubly exponential mean
Here we mention two generalizations of the exponential mean. The first one, defined as
is called the doubly exponential mean. As p tends to 1, mp,q tends to mq. The other generalization is as follows:
Here q can be negative. When q = 1, it corresponds to mp.