Hannes Leeb, Vienna, University Homepage On the Conditional Distributions of Low-Dimensional Projections from High-Dimensional Data Abstract: We study the conditional distribution of low-dimensional projections from high-dimensional data, where the conditioning is on other low-dimensional projections. To fix ideas, consider a random d-vector Z that has a Lebesgue density and that is standardized so that EZ = 0 and EZZ' = I_d. Moreover, consider two projections defined by unit vectors α and β, namely a response y = α'Z and an explanatory variable x = β'Z. It has long been known that the conditional mean of y given x is approximately linear in x (under some regularity conditions); cf. Hall and Li (1993). However, a corresponding result for the conditional variance has not been available so far. We here show that the conditional variance of y given x is approximately constant in x (again, under some regularity conditions). These results hold uniformly in α and for most β's, provided only that the dimension of Z is large. In that sense, we see that most linear submodels of a high-dimensional overall model are approximately correct. Our findings provide new insights into a variety of modeling scenarios. We discuss several examples, including sparse linear modeling, generalized linear models under potential link violation, sliced inverse regression, sliced average variance estimation, and kernel learning machines. |
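
The two claims in the abstract can be checked numerically with a small simulation. The sketch below is an illustration only, not the method of the talk. It assumes that Z has i.i.d. components uniform on [-sqrt(3), sqrt(3)] (so that EZ = 0 and the covariance is the identity), writes beta for the unit vector defining x and alpha for the unit vector defining y, and fixes their inner product at a hypothetical value rho = 0.6. The function name `conditional_moments`, the quantile binning, and all parameter values are choices made for this example. If the conditional mean of y given x is close to rho * x and the conditional variance is close to 1 - rho^2, then the residual y - rho * x should have mean near 0 and variance near 0.64 in every bin of x.

```python
import numpy as np


def conditional_moments(d=100, n=50_000, rho=0.6, n_bins=10, seed=0):
    """Binned mean and variance of y - rho * x, for x = beta'Z and y = alpha'Z.

    Assumption for this sketch: Z has i.i.d. uniform components, which is
    one simple non-Gaussian choice with a Lebesgue density.
    """
    rng = np.random.default_rng(seed)

    # Standardized data: each component has mean 0 and variance 1.
    Z = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(n, d))

    # A random unit vector beta defining the explanatory variable.
    beta = rng.standard_normal(d)
    beta /= np.linalg.norm(beta)

    # A unit vector g orthogonal to beta, used to build alpha with alpha'beta = rho.
    g = rng.standard_normal(d)
    g -= (g @ beta) * beta
    g /= np.linalg.norm(g)
    alpha = rho * beta + np.sqrt(1.0 - rho**2) * g

    x = Z @ beta
    y = Z @ alpha

    # Residual from the candidate linear conditional mean rho * x.
    resid = y - rho * x

    # Assign each observation to one of n_bins quantile bins of x.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)

    means = np.array([resid[idx == k].mean() for k in range(n_bins)])
    variances = np.array([resid[idx == k].var() for k in range(n_bins)])
    return means, variances


means, variances = conditional_moments()
print("binned mean of y - rho*x:", np.round(means, 3))
print("binned variance of y - rho*x:", np.round(variances, 3))
print("target variance 1 - rho^2:", 1 - 0.6**2)
```

Working with the residual rather than with y itself keeps the within-bin spread of x from inflating the variance estimates in the wide tail bins. Binned averages are only a crude stand-in for the conditional moments, so the output should be read as a plausibility check rather than as evidence for the uniformity statements in the abstract.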