Consider the standard linear regression model y = Xβ^* + w, where y ∈ R^n is an observation vector, X ∈ R^{n×d} is a measurement matrix, β^* ∈ R^d is the unknown regression vector, and w ∼ N(0, σ^2 I) is additive Gaussian noise. This paper determines sharp minimax rates of convergence for estimation of β^* in ℓ_2-norm, assuming that β^* belongs to a weak ℓ_q-ball B_q(R_q) for some q ∈ [0, 1]. We show that under suitable regularity conditions on the design matrix X, the minimax error in squared ℓ_2-norm scales as R_q (log d / n)^{1 − q/2}. In addition, we provide lower bounds on rates of convergence for general ℓ_p-norms (for all p ∈ [1, +∞], p ≠ q). Our proofs of the lower bounds are information-theoretic in nature, based on Fano's inequality and results on the metric entropy of the balls B_q(R_q). Matching upper bounds are derived by direct analysis of the solution of an optimization problem over B_q(R_q). We prove that the conditions on X required by optimal algorithms are satisfied with high probability by broad classes of non-i.i.d. Gaussian random matrices, for which RIP or other sparse eigenvalue conditions are violated. For q = 0, ℓ_1-based methods (the Lasso and the Dantzig selector) achieve the minimax-optimal rate in ℓ_2 error, but require stronger regularity conditions on the design than the non-convex optimization algorithm used to establish the minimax upper bounds.
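As a rough numerical illustration (not part of the paper), recall that the ℓ_q-ball is B_q(R_q) = {β ∈ R^d : Σ_j |β_j|^q ≤ R_q}, with q = 0 counting nonzero coordinates, so R_0 = s gives the hard-sparsity rate s log(d)/n. The sketch below simulates this q = 0 case with an i.i.d. Gaussian design and fits the Lasso (one of the ℓ_1-based methods named above) with the standard regularization scaling λ ∝ σ √(log d / n); the dimensions, signal magnitudes, and constant in λ are illustrative choices of ours, not values from the paper.

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
d, s, sigma = 400, 5, 1.0          # ambient dimension, sparsity (q = 0), noise level

for n in (100, 200, 400, 800):
    # Hard-sparse ground truth: s nonzero coordinates of magnitude 1.
    beta_star = np.zeros(d)
    beta_star[rng.choice(d, size=s, replace=False)] = 1.0

    X = rng.standard_normal((n, d))            # i.i.d. Gaussian design
    y = X @ beta_star + sigma * rng.standard_normal(n)

    # Standard theoretical scaling lambda ~ sigma * sqrt(log d / n);
    # the factor 2 is an illustrative constant.
    lam = 2.0 * sigma * np.sqrt(np.log(d) / n)
    beta_hat = Lasso(alpha=lam, fit_intercept=False, max_iter=50_000).fit(X, y).coef_

    err = np.sum((beta_hat - beta_star) ** 2)  # squared l2 estimation error
    rate = s * np.log(d) / n                   # predicted minimax scaling for q = 0
    print(f"n={n:4d}  squared l2 error={err:.4f}  s*log(d)/n={rate:.4f}")

Running this, the observed squared ℓ_2 error should shrink roughly in proportion to s log(d)/n as n grows, consistent with the q = 0 specialization of the R_q (log d / n)^{1 − q/2} rate; it is a sanity check on the scaling, not a verification of the minimax constants.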