Update: This conjecture has now been solved! See “A simple proof of the Gaussian correlation conjecture extended to multivariate gamma distributions” by T. Royen, and “Royen’s proof of the Gaussian correlation inequality” by Rafał Latała and Dariusz Matlak.
We continue investigating the Gaussian correlation conjecture in this post. This states that if $\mu$ is the standard Gaussian measure on $\mathbb{R}^n$ then

\displaystyle \mu(A\cap B)\ \ge\ \mu(A)\mu(B) \qquad (1)

for all symmetric and convex sets $A,B\subseteq\mathbb{R}^n$. In this entry, we consider a stronger ‘local’ version of the conjecture, which has the advantage that it can be approached using differential calculus. Inequality (1) can alternatively be stated in terms of integrals,

\displaystyle \int fg\,d\mu\ \ge\ \int f\,d\mu\int g\,d\mu. \qquad (2)

This is clearly equivalent to (1) when $f,g$ are indicator functions of convex symmetric sets. More generally, using linearity, it extends to all nonnegative functions $f,g$ such that the level sets $\{f\ge a\}$ and $\{g\ge a\}$ are symmetric and convex subsets of $\mathbb{R}^n$ for all positive $a$.
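To spell out the extension, write $A_a=\{f\ge a\}$ and $B_b=\{g\ge b\}$ for the level sets, so that

\displaystyle f=\int_0^\infty 1_{A_a}\,da,\qquad g=\int_0^\infty 1_{B_b}\,db.

Applying (1) to each pair of symmetric convex sets $A_a$, $B_b$ and integrating gives

\displaystyle \int fg\,d\mu=\int_0^\infty\!\!\int_0^\infty\mu(A_a\cap B_b)\,da\,db\ \ge\ \int_0^\infty\!\!\int_0^\infty\mu(A_a)\mu(B_b)\,da\,db=\int f\,d\mu\int g\,d\mu.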
A class of functions lying between these two extremes, which I consider here, is the class of log-concave functions. The reason for concentrating on this class of functions is that they satisfy the following stability result, due to Prékopa (available here) and Leindler.
Theorem 1 If $f\colon\mathbb{R}^n\times\mathbb{R}^m\to\mathbb{R}_+$ is log-concave then so is

\displaystyle x\mapsto\int_{\mathbb{R}^m}f(x,y)\,dy.
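In particular, since the standard Gaussian density is itself log-concave, applying Theorem 1 to the log-concave function $(x,y)\mapsto f(x+y)e^{-\Vert y\Vert^2/2}$ shows that if $f$ is log-concave then so is the Gaussian smoothing

\displaystyle x\mapsto\int f(x+y)\,d\mu(y).

This special case is the one used below.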
The approach used in this post is as follows. Given an $n\times n$ matrix $Q$, the idea is to consider a probability measure $\mathbb{P}_Q$ with respect to which we have joint normal random variables $X_1,\ldots,X_n$ and $Y_1,\ldots,Y_n$, all with zero mean and unit variance, and covariances given by

\displaystyle {\rm Cov}(X_i,X_j)={\rm Cov}(Y_i,Y_j)=\delta_{ij},\qquad {\rm Cov}(X_i,Y_j)=Q_{ij}.

The random variables $(X,Y)=(X_1,\ldots,X_n,Y_1,\ldots,Y_n)$ then have covariance matrix

\displaystyle \begin{pmatrix}I&Q\\ Q^{\rm T}&I\end{pmatrix}.

This must be positive semidefinite or, equivalently, $\Vert Q\Vert\le1$ in the operator norm.
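The operator norm characterization is a standard Schur complement computation:

\displaystyle \begin{pmatrix}I&Q\\ Q^{\rm T}&I\end{pmatrix}\succeq0\ \Longleftrightarrow\ I-Q^{\rm T}Q\succeq0\ \Longleftrightarrow\ \Vert Q\Vert\le1.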
Writing $\mathbb{E}_Q$ for expectation with respect to the probability measure $\mathbb{P}_Q$, the Gaussian correlation conjecture is equivalent to the statement

\displaystyle \mathbb{E}_I\left[f(X)g(Y)\right]\ \ge\ \mathbb{E}_0\left[f(X)g(Y)\right] \qquad (3)

for all symmetric log-concave $f,g\colon\mathbb{R}^n\to\mathbb{R}_+$, where $I$ denotes the identity matrix. Indeed, under $\mathbb{P}_I$ we have $X=Y$ almost surely, so the left hand side equals $\int fg\,d\mu$, while under $\mathbb{P}_0$ the vectors $X$ and $Y$ are independent, so the right hand side equals $\int f\,d\mu\int g\,d\mu$.
Furthermore, considering $\mathbb{E}_Q\left[f(X)g(Y)\right]$ as a function of $Q$, its partial derivatives can be calculated as follows,

\displaystyle \frac{\partial}{\partial Q_{ij}}\mathbb{E}_Q\left[f(X)g(Y)\right]=\mathbb{E}_Q\left[f_i(X)g_j(Y)\right], \qquad (4)

where subscripts on $f$ and $g$ denote partial derivatives, $f_i=\partial f/\partial x_i$ and $f_{ij}=\partial^2f/\partial x_i\partial x_j$.
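To indicate where (4) comes from (a sketch, assuming $f$ and $g$ are smooth with enough decay for the boundary terms to vanish): the joint density $p_Q$ of $(X,Y)$ satisfies Plackett’s identity $\partial p_Q/\partial Q_{ij}=\partial^2p_Q/\partial x_i\partial y_j$, so differentiating under the integral sign and integrating by parts once in $x_i$ and once in $y_j$ gives

\displaystyle \frac{\partial}{\partial Q_{ij}}\int f(x)g(y)\,p_Q(x,y)\,dx\,dy=\int f(x)g(y)\frac{\partial^2p_Q}{\partial x_i\partial y_j}\,dx\,dy=\int f_i(x)g_j(y)\,p_Q(x,y)\,dx\,dy.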
Using this, the Taylor expansion to second order about $Q=0$ can be written as,

\displaystyle \mathbb{E}_Q\left[f(X)g(Y)\right]=\mathbb{E}\left[f(X)\right]\mathbb{E}\left[g(Y)\right]+Q_{ij}\mathbb{E}\left[f_i(X)\right]\mathbb{E}\left[g_j(Y)\right]+\frac12Q_{ij}Q_{kl}\mathbb{E}\left[f_{ik}(X)\right]\mathbb{E}\left[g_{jl}(Y)\right]+o\left(\Vert Q\Vert^2\right)

(we use the summation convention where repeated indices in a product are summed over). As $f$ and $g$ are symmetric, the first order term vanishes. From Theorem 1, the function $x\mapsto\int f(x+y)\,d\mu(y)$ is log-concave and symmetric, so it is maximized at the origin and, therefore, the matrix $-\mathbb{E}\left[f_{ij}(X)\right]$ is positive semidefinite, and similarly for $g$. So, to second order in $Q$,

\displaystyle \mathbb{E}_Q\left[f(X)g(Y)\right]\ \ge\ \mathbb{E}\left[f(X)\right]\mathbb{E}\left[g(Y)\right].
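To see that the quadratic term is nonnegative, write $A=-\mathbb{E}\left[\nabla^2f(X)\right]$ and $B=-\mathbb{E}\left[\nabla^2g(Y)\right]$, which are positive semidefinite as just noted. Then

\displaystyle \frac12Q_{ij}Q_{kl}\mathbb{E}\left[f_{ik}(X)\right]\mathbb{E}\left[g_{jl}(Y)\right]=\frac12{\rm tr}\left(Q^{\rm T}AQB\right)\ \ge\ 0,

since $Q^{\rm T}AQ$ and $B$ are positive semidefinite and the trace of a product of two positive semidefinite matrices is nonnegative.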
If $f,g$ are strictly log-concave, so that $-\mathbb{E}\left[\nabla^2f(X)\right]$ and $-\mathbb{E}\left[\nabla^2g(Y)\right]$ are positive definite, then this shows that $\mathbb{E}_Q\left[f(X)g(Y)\right]$ has a local minimum at $Q=0$. This suggests the following conjecture, which is stronger than the Gaussian correlation inequality.
Conjecture: For symmetric log-concave functions $f,g$ on $\mathbb{R}^n$, over all matrices $Q$ with $\Vert Q\Vert\le1$, the function $Q\mapsto\mathbb{E}_Q\left[f(X)g(Y)\right]$ is minimized at $Q=0$. In fact, we conjecture that it never has a (strict) local minimum at any $Q\not=0$.
Certainly, this conjecture would imply (3). As it is a stronger statement than the Gaussian correlation inequality, we can be less sure that it holds than we are of the original conjecture. However, if it is true then it has one advantage: it can be approached locally with respect to the covariances, considering only infinitesimal changes in $Q$. In fact, it can be shown to be equivalent to the following (for any fixed log-concave and differentiable $f,g$). I do not give the proof here, to save space, but it is just an application of equation (4).
Conjecture: For all symmetric log-concave functions $f,g$ on $\mathbb{R}^n$, the matrix $M$ with entries

\displaystyle M_{ij}=\int f_i\,g_j\,d\mu

is never symmetric and negative definite.
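Note that, unlike in the Lebesgue computation below, this matrix need not itself be symmetric. Integrating by parts against the Gaussian density gives

\displaystyle M_{ij}=\int f_i\,g_j\,d\mu=-\int f_{ij}\,g\,d\mu+\int x_jf_i\,g\,d\mu,

and, while the first term on the right is symmetric in $i,j$, the second in general is not.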
If this can be proven, then the correlation conjecture follows. As demonstrated in 1977 by Pitt, the Gaussian correlation conjecture will follow if this matrix always has nonnegative trace (this was used to prove the two dimensional case of the conjecture), which would certainly imply that it can’t be negative definite. It is easily shown that the matrix will indeed have nonnegative trace if the Gaussian measure is replaced by the standard (Lebesgue) measure. This is because $x\mapsto\int f(x+y)g(y)\,dy$ is log-concave and symmetric in $x$ (by Theorem 1), and is therefore maximized at $x=0$. Expanding to second order about the origin shows that the matrix with entries $\int f_i\,g_j\,dy$ is symmetric and positive semidefinite. In fact, in two dimensions it can be shown that the matrix with entries $\int_{\Vert y\Vert\le r}f_i\,g_j\,dy$ has nonnegative trace for every $r>0$ (see Pitt). From this it follows that the matrix with entries $\int f_i\,g_j\,\phi\,dy$ has nonnegative trace for all decreasing functions $\phi$ of $\Vert y\Vert$, since any such $\phi$ is a mixture of indicator functions of centered balls. In particular, letting $\phi$ be the Gaussian density, the above conjecture follows in two dimensions. I don’t know whether a similar argument can be carried across to an arbitrary number of dimensions.
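To fill in the second order expansion step (a standard integration by parts, assuming enough smoothness and decay): with $h(x)=\int f(x+y)g(y)\,dy$,

\displaystyle \frac{\partial^2h}{\partial x_i\partial x_j}(0)=\int f_{ij}(y)g(y)\,dy=-\int f_i(y)g_j(y)\,dy,

so the Hessian of $h$ being negative semidefinite at its maximum is exactly the statement that the matrix $\left(\int f_i\,g_j\,dy\right)_{ij}$ is positive semidefinite.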
Consider, for example, the case where $g$ is a function of $x^{\rm T}Cx$ for a positive definite matrix $C$. This includes the case where the corresponding symmetric convex set is a centered ellipsoid (proven in 1999 by Hargé) and, as a limit of centered ellipsoids, where it is a symmetric slab (i.e., the region between two parallel hyperplanes, independently proven in 1967 by both Khatri and Šidák). Say, $g(x)=k\left(x^{\rm T}Cx\right)$. Then,

\displaystyle \sum_{i,j}\left(C^{-1}\right)_{ij}M_{ij}=\sum_{i,j}\left(C^{-1}\right)_{ij}\int f_i\,g_j\,d\mu=2\int k'\left(x^{\rm T}Cx\right)\,x\cdot\nabla f(x)\,d\mu(x)\ \ge\ 0.

The final inequality follows because $f$ and $g$ are necessarily decreasing along lines going radially out from the origin, implying that $x\cdot\nabla f(x)$ and $k'$ are both nonpositive. Consequently, $M$ cannot be negative definite in this case and the Gaussian correlation inequality follows for centered ellipsoids.
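To justify why $M$ cannot be negative definite here: if $M$ were symmetric and negative definite then, writing $C^{-1}=SS^{\rm T}$ with $S$ invertible, we would have

\displaystyle \sum_{i,j}\left(C^{-1}\right)_{ij}M_{ij}={\rm tr}\left(C^{-1}M\right)={\rm tr}\left(S^{\rm T}MS\right)<0,

contradicting the inequality above. So, for such $g$, the matrix $M$ is never symmetric and negative definite.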
Arrgh! I’ve been using the evil we throughout my initial posts. I’ll do better in future.
It is perfectly OK to use “we” about yourself and your tapeworm — which should of course be properly credited.
By the way, is it just me, or are we lacking an RSS feed for new posts? (I.e. do we have to resort to e-mail notification?)
Thanks Porcus, I’ll bear that in mind.
And no, it’s not just you (or your tapeworm), but I’ve added RSS feeds now.
Hi,
You might be interested in this paper
arXiv:1012.0676
Regards
Thanks.
However, I do see a problem in that paper. How does proving the result for sets with Gaussian measure
imply it for all sets? It mentions "contracting the sets linearly", but how does that work? Certainly
for c less than one. Also, for sets contained in
,
holds. If I try scaling the sets down so that they have probability no more than 10-7, this leads to a factor of
which, rather crucially, depends on d. You can probably do better than that, but I don’t know where the claimed
comes from.
There is something else in Guan’s paper that I do not understand:
On Page 2, line 4, he seems to apply the CLT to the set defined there, and the left hand side of the inequality defining that set depends on the index appearing in the CLT, so one would need some uniformity of the convergence in the CLT. I do not see that, can anybody help me?
Thomas Schlumprecht:
That’s a very good point! It does look like a problem. In fact, conditioning on the event
(i.e., the event
in the paper), the law of large numbers implies that the term on the left hand side of the inequality defining
almost surely grows at rate
This is at least as large as
, and is larger than the right hand side of the inequality whenever
(and I see no reason why that shouldn’t be the case). Then,
, which means that the CLT cannot give the claimed bound. Using this expression instead of the CLT gives zero on the right hand side of (2.4), which isn’t very useful, and I don’t see how the rest of the argument can follow.
So, unless I am mistaken, this does seem like a major problem in the argument.
Hi,
Thanks for your observations; they seem quite clear and serious enough to raise legitimate doubts about the proof. Unfortunately, I couldn’t find the author’s e-mail address so that he could discuss his argument with you.
Best Regards
The author of the paper seems to be this: http://159.226.47.50:8080/iam/guanqingyang/