## Wednesday, March 2, 2022

### The Exponential Mean: Alternative to Classic Means

Given n observations x1, ..., xn, the generalized mean (also called the power mean) is defined as

$$M_p = \left(\frac{1}{n}\sum_{k=1}^{n} x_k^{\,p}\right)^{1/p}.$$

The case p = 1 corresponds to the traditional arithmetic mean, p = 0 yields the geometric mean (as the limit when p tends to 0), and p = -1 yields the harmonic mean. See here for details. This metric is favored by statisticians. It is a particular case of the quasi-arithmetic mean

$$M_f = f^{-1}\left(\frac{1}{n}\sum_{k=1}^{n} f(x_k)\right),$$

obtained with $f(x) = x^p$.
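The three classic special cases are easy to check numerically. The following is a minimal sketch, assuming strictly positive observations; the function name `power_mean` and the sample values are made up for illustration.

```python
import math

def power_mean(xs, p):
    """Generalized (power) mean M_p of strictly positive observations."""
    n = len(xs)
    if p == 0:
        # Limit case p -> 0: the geometric mean, computed through logarithms.
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x ** p for x in xs) / n) ** (1.0 / p)

data = [1.0, 2.0, 4.0]            # made-up values, for illustration only
arithmetic = power_mean(data, 1)   # (1 + 2 + 4) / 3
geometric = power_mean(data, 0)    # (1 * 2 * 4) ** (1/3)
harmonic = power_mean(data, -1)    # 3 / (1 + 1/2 + 1/4)
print(arithmetic, geometric, harmonic)
```

Treating p = 0 as a separate branch avoids the division by zero in the exponent 1/p.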

Here I introduce another kind of mean called the exponential mean, also based on a parameter p, that may appeal to data scientists and machine learning professionals. It is also a special case of the quasi-arithmetic mean. Though the concept is basic, there is very little if any literature about it. It is related to LogSumExp and the log semiring. It is defined as follows:

$$m_p = \log_p\left(\frac{1}{n}\sum_{k=1}^{n} p^{x_k}\right).$$

Here the logarithm is in base p, with p positive. When p tends to 0, mp tends to the minimum of the observations. When p tends to 1, it yields the classic arithmetic mean, and as p tends to infinity, it yields the maximum of the observations.
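The limiting behavior can be explored with a short sketch. It assumes the exponential mean is log_p((1/n) Σ p^(x_k)) and rewrites it with natural logarithms as (ln Σ exp(t·x_k) − ln n) / t, where t = ln p. The function name `exponential_mean`, the log-sum-exp shift, and the sample values are illustrative choices, not part of the original definition.

```python
import math

def exponential_mean(xs, p):
    """Exponential mean m_p = log_p((1/n) * sum(p ** x)), for p > 0."""
    if p <= 0:
        raise ValueError("p must be positive")
    n = len(xs)
    t = math.log(p)
    if abs(t) < 1e-12:
        # Limit case p -> 1: the arithmetic mean.
        return sum(xs) / n
    # Log-sum-exp trick: factor out the dominant term before exponentiating,
    # so that extreme values of p do not overflow or underflow.
    shift = max(t * x for x in xs)
    log_sum = shift + math.log(sum(math.exp(t * x - shift) for x in xs))
    return (log_sum - math.log(n)) / t

data = [-1.0, 0.5, 3.0]                      # made-up values; negatives are allowed
near_min = exponential_mean(data, 1e-300)    # p close to 0
near_mean = exponential_mean(data, 1.000001) # p close to 1
near_max = exponential_mean(data, 1e300)     # p very large
print(near_min, near_mean, near_max)
```

Even with p as extreme as 1e-300 or 1e300, the result is only close to the minimum or maximum, not equal to it, which hints at how slowly these limits are reached.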

One advantage of the exponential mean is that it always exists, even if the observations take on negative values. This is not the case in general for the power mean if p is not an integer. Also, the exponential mean is stable under translation, unlike the power mean. That is, if you add a constant a to each observation, then the exponential mean is shifted by the same quantity a. In short, mp(x1 + a, ..., xn + a) = a + mp(x1, ..., xn). By contrast, the power mean is stable under multiplication by a positive constant, while the exponential mean is not.
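Both properties follow in one line each, taking $m_p = \log_p\big(\tfrac{1}{n}\sum_k p^{x_k}\big)$ as the working form of the exponential mean. The constants $a$ and $c > 0$ below are arbitrary, and $p \neq 1$ is assumed.

$$m_p(x_1 + a, \dots, x_n + a) = \log_p\left(p^{a}\cdot\frac{1}{n}\sum_{k=1}^{n} p^{x_k}\right) = a + m_p(x_1, \dots, x_n),$$

$$m_p(c\,x_1, \dots, c\,x_n) = \log_p\left(\frac{1}{n}\sum_{k=1}^{n} (p^{c})^{x_k}\right) = c \cdot m_{p^{c}}(x_1, \dots, x_n).$$

The second identity uses $\log_p y = c \log_{p^c} y$. Scaling the data by $c$ therefore changes the effective parameter from $p$ to $p^c$. This is why the exponential mean is not stable under multiplication, whereas $M_p(c\,x_1, \dots, c\,x_n) = c\,M_p(x_1, \dots, x_n)$ holds for the power mean.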

Finally, the central limit theorem applies to both the power and exponential means as the number n of observations grows.

#### Illustration on a test data set

I tested both means (exponential and power means) for various values of p ranging between 0 and 2. See above chart, where the X-axis represents the parameter p, and the Y-axis represents the mean. The test data set consists of 10 numbers randomly chosen between 0 and 1, with an average value of 0.53. Note that if p = 1, then mp = Mp = 0.53 is the standard arithmetic mean.

The blue curve in the above chart is very well approximated by a logarithmic function, except when p is very close to zero or p is extremely large. The red curve is well approximated by a second-degree polynomial. Convergence to the maximum of the observations (equal to 0.89 here), as p tends to infinity, occurs much faster with the power mean than with the exponential mean. Note that min(x1, ..., xn) = 0.07 in this example, and the exponential mean will start approaching that value only when p is extremely close to zero.
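A comparison of this kind can be reproduced in a few lines. The sketch below uses a seeded synthetic sample of 10 uniform numbers, which is not the data set behind the chart (its values are not given). The function names `power_mean` and `exponential_mean` and the grid of p values are illustrative.

```python
import math
import random

def power_mean(xs, p):
    """Power mean M_p for strictly positive observations."""
    n = len(xs)
    if p == 0:
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x ** p for x in xs) / n) ** (1.0 / p)

def exponential_mean(xs, p):
    """Exponential mean m_p = log_p((1/n) * sum(p ** x)), for p > 0."""
    n = len(xs)
    if p == 1:
        return sum(xs) / n  # limit case: arithmetic mean
    return math.log(sum(p ** x for x in xs) / n) / math.log(p)

random.seed(0)  # arbitrary seed; NOT the data set used in the post
data = [random.random() for _ in range(10)]

grid = [0.25 * k for k in range(1, 9)]  # p = 0.25, 0.50, ..., 2.00
rows = [(p, exponential_mean(data, p), power_mean(data, p)) for p in grid]
for p, m_p, M_p in rows:
    print(f"p={p:4.2f}  exponential={m_p:.4f}  power={M_p:.4f}")
```

Both columns agree at p = 1 and increase with p. Plotting `rows` gives a picture of the same kind as the chart, though the actual numbers depend on the sample.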

#### Important inequality

This inequality, valid for the power mean Mp and resulting from the convexity of some underlying function, also applies to the exponential mean mp:

• If p < q, then mp ≤ mq, and mp = mq if and only if x1 = ... = xn.
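One way to see why such an inequality can hold is sketched here, under the working definition $m_p = \log_p\big(\tfrac{1}{n}\sum_k p^{x_k}\big)$. It is not necessarily the argument the post links to. Write $t = \ln p$ and $K(t) = \ln\big(\tfrac{1}{n}\sum_k e^{t x_k}\big)$, where $t$, $K$ and the weights $w_k$ are notation introduced for this sketch only.

$$m_p = \frac{K(t)}{t}, \qquad K(0) = 0, \qquad K''(t) = \sum_{k=1}^{n} w_k\,(x_k - \bar{x}_w)^2 \ge 0, \qquad w_k = \frac{e^{t x_k}}{\sum_{j} e^{t x_j}}, \quad \bar{x}_w = \sum_{k} w_k x_k.$$

Since $K$ is convex and vanishes at 0, the chord slope $K(t)/t$ is nondecreasing in $t$, and therefore in $p = e^{t}$. It is strictly increasing unless $K''$ vanishes identically, which happens exactly when all the $x_k$ are equal.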

Proving this inequality is equivalent (see here) to proving that

The derivative of mp with respect to p is well approximated by a power function and is positive everywhere, suggesting that the inequality does hold. Let us denote this derivative as m'p. We have (see here):

$$m'_p = \frac{1}{p \ln p}\left(\frac{\sum_{k=1}^{n} x_k\, p^{x_k}}{\sum_{k=1}^{n} p^{x_k}} - m_p\right).$$

It is interesting to note that m1 is the arithmetic mean and m'1 is half the empirical variance of x1, ..., xn. It would be interesting to see how higher order derivatives of mp evaluated at p = 1, are related to higher empirical moments of x1, ..., xn.
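The half-variance claim can be checked numerically. The sketch below assumes m_p = log_p((1/n) Σ p^(x_k)) and compares a central finite difference with the closed form obtained by differentiating ln(S) / ln(p), where S = (1/n) Σ p^(x_k). The helper names, the step size h, and the sample values are made up for illustration.

```python
import math

def exponential_mean(xs, p):
    """m_p = log_p((1/n) * sum(p ** x)); the limit p -> 1 is the arithmetic mean."""
    n = len(xs)
    if p == 1:
        return sum(xs) / n
    return math.log(sum(p ** x for x in xs) / n) / math.log(p)

def derivative_closed_form(xs, p):
    """d m_p / dp from differentiating ln(S) / ln(p); valid for p != 1."""
    weights = [p ** x for x in xs]
    tilted_mean = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
    return (tilted_mean - exponential_mean(xs, p)) / (p * math.log(p))

def derivative_numeric(xs, p, h=1e-4):
    """Central finite difference, used as an independent check."""
    return (exponential_mean(xs, p + h) - exponential_mean(xs, p - h)) / (2 * h)

data = [0.2, 0.5, 0.9, 1.4]  # made-up observations
mean = sum(data) / len(data)
# Empirical variance with the 1/n convention, then halved.
half_variance = sum((x - mean) ** 2 for x in data) / len(data) / 2

slope_at_1 = derivative_numeric(data, 1.0)
slope_at_2 = derivative_numeric(data, 2.0)
print(slope_at_1, half_variance)
print(slope_at_2, derivative_closed_form(data, 2.0))
```

The agreement at p = 1 relies on the variance being computed with the 1/n convention rather than 1/(n − 1).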

#### Doubly exponential mean

Here we mention two generalizations of the exponential mean. The first one, defined as

is called the doubly exponential mean. As p tends to 1, mp,q tends to mq. The other generalization is as follows:

Here q can be negative. When q = 1, it corresponds to mp.