We study the problem of learning multivariate log-concave densities with respect to a global loss function. We obtain the first upper bound on the sample complexity of the maximum likelihood estimator (MLE) for a log-concave density on $mathbb{R}^d$, for all $d geq 4$. Prior to this work, no finite sample upper bound was known for this estimator in more than $3$ dimensions. In more detail, we prove that for any $d geq 1$ and $epsilon>0$, given $ilde{O}_d((1/epsilon)^{(d+3)/2})$ samples drawn from an unknown log-concave density $f_0$ on $mathbb{R}^d$, the MLE outputs a hypothesis $h$ that with high probability is $epsilon$-close to $f_0$, in squared Hellinger loss. A sample complexity lower bound of $Omega_d((1/epsilon)^{(d+1)/2})$ was previously known for any learning algorithm that achieves this guarantee. We thus establish that the sample complexity of the log-concave MLE is near-optimal, up to an $ilde{O}(1/epsilon)$ factor.
展开▼