Seminars
Tom Berrett, Cambridge U.
Thursday, 22 November 2018, 12:15 - 13:15

Tom Berrett, Cambridge University

 Efficient multivariate entropy estimation and independence testing 

Efficient multivariate entropy estimation via k-nearest neighbour distances 

 Nonparametric independence testing via mutual information

Abstract: Many statistical procedures, including goodness-of-fit tests and methods for independent component analysis, rely critically on estimating the entropy of a distribution. In this talk I will first describe new entropy estimators that are efficient and achieve the local asymptotic minimax lower bound with respect to squared error loss. These estimators are constructed as weighted averages of the estimators originally proposed by Kozachenko and Leonenko (1987), which are based on the k-nearest neighbour distances of a sample of n independent and identically distributed random vectors in d dimensions. A careful choice of weights enables us to obtain an efficient estimator for arbitrary d, given sufficient smoothness, whereas the original unweighted estimator is typically efficient only for d up to 3.
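
To make the construction concrete, here is a minimal sketch of the unweighted Kozachenko-Leonenko estimator and of a weighted average over k = 1, ..., K. It assumes Euclidean distances, a brute-force neighbour search, and entropy measured in nats. The function names (digamma_int, kl_entropies, weighted_kl_entropy) are hypothetical, and the weights are supplied by the caller: the specific weight choice discussed in the talk is not reproduced here.

```python
import math
import random


def digamma_int(k):
    """Digamma function at a positive integer: psi(k) = -gamma + sum_{j<k} 1/j."""
    euler_gamma = 0.5772156649015329
    return -euler_gamma + sum(1.0 / j for j in range(1, k))


def kl_entropies(sample, k_max):
    """Unweighted Kozachenko-Leonenko estimates [H_1, ..., H_k_max] in nats.

    sample is a list of n points, each a tuple of d floats, assumed to have
    no ties (a zero neighbour distance would make the logarithm undefined).
    """
    n, d = len(sample), len(sample[0])
    # Log-volume of the unit Euclidean ball in d dimensions.
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(1 + d / 2)
    sums = [0.0] * k_max
    for i, x in enumerate(sample):
        # Sorted distances from x to every other point (brute force).
        dists = sorted(math.dist(x, y) for j, y in enumerate(sample) if j != i)
        for k in range(1, k_max + 1):
            rho = dists[k - 1]  # k-th nearest neighbour distance
            sums[k - 1] += (d * math.log(rho) + log_unit_ball
                            + math.log(n - 1) - digamma_int(k))
    return [s / n for s in sums]


def weighted_kl_entropy(sample, weights):
    """Weighted average of H_1, ..., H_K; the weights must sum to one."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    per_k = kl_entropies(sample, len(weights))
    return sum(w * h for w, h in zip(weights, per_k))


# Usage: standard normal sample in d = 1, all weight on k = 3 (illustrative only).
random.seed(0)
sample = [(random.gauss(0.0, 1.0),) for _ in range(300)]
print(weighted_kl_entropy(sample, [0.0, 0.0, 1.0]))
print(0.5 * math.log(2 * math.pi * math.e))  # true entropy of N(0, 1)
```

Putting all the weight on a single k recovers the original unweighted estimator, so the weighted form strictly generalises it.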

In the second part of the talk, I will use our entropy estimators to propose a test of independence of two multivariate random vectors, given a sample from the underlying population. Our approach, which we call MINT, is based on the estimation of mutual information, which we may decompose into joint and marginal entropies. The proposed critical values, which may be obtained by simulation when an approximation to one marginal is available and by resampling otherwise, facilitate size guarantees, and we provide local power analyses, uniformly over classes of densities whose mutual information satisfies a lower bound. Our ideas may be extended to provide new goodness-of-fit tests of normal linear models, based on assessing the independence of our vector of covariates and an appropriately defined notion of an error vector.
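
The decomposition of mutual information into entropies, I(X; Y) = H(X) + H(Y) - H(X, Y), can be sketched with a plain permutation test as follows. This is an illustration only, not the MINT procedure itself: it assumes the unweighted Kozachenko-Leonenko estimator with a fixed k, Euclidean distances, and a generic permutation p-value, and the function names (kl_entropy, mutual_information, permutation_p_value) are hypothetical.

```python
import math
import random


def kl_entropy(sample, k=3):
    """Unweighted Kozachenko-Leonenko entropy estimate in nats (brute force)."""
    n, d = len(sample), len(sample[0])
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(1 + d / 2)
    psi_k = -0.5772156649015329 + sum(1.0 / j for j in range(1, k))  # digamma(k)
    total = 0.0
    for i, x in enumerate(sample):
        # k-th nearest neighbour distance of x among the other points.
        rho = sorted(math.dist(x, y) for j, y in enumerate(sample) if j != i)[k - 1]
        total += d * math.log(rho)
    return total / n + log_unit_ball + math.log(n - 1) - psi_k


def mutual_information(xs, ys, k=3):
    """Estimate I(X; Y) = H(X) + H(Y) - H(X, Y) from paired lists of tuples."""
    joint = [x + y for x, y in zip(xs, ys)]  # concatenate coordinates
    return kl_entropy(xs, k) + kl_entropy(ys, k) - kl_entropy(joint, k)


def permutation_p_value(xs, ys, n_perm=99, k=3, rng=None):
    """Permutation p-value for independence, using estimated mutual information."""
    rng = rng or random.Random(0)
    # Marginal entropy estimates do not change when ys is permuted.
    marginals = kl_entropy(xs, k) + kl_entropy(ys, k)
    observed = marginals - kl_entropy([x + y for x, y in zip(xs, ys)], k)
    exceed = 0
    for _ in range(n_perm):
        shuffled = ys[:]
        rng.shuffle(shuffled)  # breaks any dependence between xs and ys
        joint = [x + y for x, y in zip(xs, shuffled)]
        if marginals - kl_entropy(joint, k) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)


# Usage: y is x plus small noise, so the two are strongly dependent.
rng = random.Random(1)
xs = [(rng.gauss(0.0, 1.0),) for _ in range(60)]
ys = [(x[0] + 0.1 * rng.gauss(0.0, 1.0),) for x in xs]
print(permutation_p_value(xs, ys, n_perm=19))
```

Because permuting one sample leaves both marginal entropy estimates unchanged, only the joint entropy is recomputed for each permutation.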


Location: R42.2.113
Contact: Nancy De Munck