
Statistics Theory

New submissions

[ total of 16 entries: 1-16 ]

New submissions for Fri, 10 May 24

[1]  arXiv:2405.05344 [pdf, other]
Title: A note on the minimax risk of sparse linear regression
Subjects: Statistics Theory (math.ST)

Sparse linear regression is one of the classical and extensively studied problems in high-dimensional statistics and compressed sensing. Despite the substantial body of literature dedicated to this problem, the precise determination of its minimax risk remains elusive. This paper aims to fill this gap by deriving an asymptotically sharp, constant-level characterization of the minimax risk of sparse linear regression. More specifically, the paper focuses on scenarios where the sparsity level, denoted by $k$, satisfies the condition $(k \log p)/n \to 0$, with $p$ and $n$ representing the number of features and observations, respectively. We establish that the minimax risk under isotropic Gaussian random design is asymptotically equal to $2\sigma^2 (k/n) \log(p/k)$, where $\sigma$ denotes the standard deviation of the noise. In addition to this result, we summarize the existing results in the literature and mention some of the fundamental problems that remain open.
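As a quick numerical sanity check, the asymptotic expression stated in the abstract is straightforward to evaluate. The sketch below (our illustration, not from the paper) plugs in representative values of $k$, $p$, $n$ in the regime $(k \log p)/n \to 0$:

```python
import numpy as np

def asymptotic_minimax_risk(sigma, k, p, n):
    """Asymptotic minimax risk 2*sigma^2*(k/n)*log(p/k) for sparse linear
    regression under isotropic Gaussian design, as stated in the abstract;
    valid in the regime where (k log p)/n tends to zero."""
    return 2.0 * sigma**2 * (k / n) * np.log(p / k)

# Illustrative values: p = 10_000 features, n = 5_000 observations,
# k = 10 nonzero coefficients, unit noise level.
risk = asymptotic_minimax_risk(sigma=1.0, k=10, p=10_000, n=5_000)
```

Here $(k \log p)/n \approx 0.018$, so the example sits well inside the stated regime.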

[2]  arXiv:2405.05419 [pdf, other]
Title: Decompounding Under General Mixing Distributions
Comments: 21 pages, 2 figures
Subjects: Statistics Theory (math.ST); Methodology (stat.ME)

This study focuses on statistical inference for compound models of the form $X=\xi_1+\ldots+\xi_N$, where $N$ is a random count and the summands $\xi_1, \xi_2, \ldots$ are independent and identically distributed (i.i.d.) random variables. The paper addresses the problem of reconstructing the distribution of $\xi$ from observed samples of $X$, a process referred to as decompounding, under the assumption that the distribution of $N$ is known. This work diverges from the conventional scope by not limiting the distribution of $N$ to the Poisson type, thus embracing a broader context. We propose a nonparametric estimate for the density of $\xi$, derive its rates of convergence, and prove that these rates are minimax optimal for suitable classes of distributions of $\xi$ and $N$. Finally, we illustrate the numerical performance of the algorithm on simulated examples.
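The forward model in this abstract is easy to simulate; the hedged sketch below generates compound sums $X = \xi_1 + \ldots + \xi_N$ for a non-Poisson count distribution (matching the paper's general setting), but implements only the data-generating side, not the paper's decompounding estimator:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_compound(n_samples, sample_N, sample_xi):
    """Draw samples of X = xi_1 + ... + xi_N for a known count
    distribution N and i.i.d. summands xi. This is only the forward
    model; decompounding recovers the density of xi from such X samples."""
    out = np.empty(n_samples)
    for i in range(n_samples):
        N = sample_N()
        out[i] = sample_xi(N).sum() if N > 0 else 0.0
    return out

# Example: N geometric on {1, 2, ...} with mean 2 (not Poisson),
# summands xi ~ Exp(1), so E[X] = E[N] * E[xi] = 2.
X = sample_compound(
    10_000,
    sample_N=lambda: rng.geometric(0.5),
    sample_xi=lambda N: rng.exponential(1.0, size=N),
)
```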

[3]  arXiv:2405.05597 [pdf, ps, other]
Title: The empirical copula process in high dimensions: Stute's representation and applications
Comments: 24 pages
Subjects: Statistics Theory (math.ST)

The empirical copula process, a fundamental tool for copula inference, is studied in the high dimensional regime where the dimension is allowed to grow to infinity exponentially in the sample size. Under natural, weak smoothness assumptions on the underlying copula, it is shown that Stute's representation is valid in the following sense: all low-dimensional margins of fixed dimension of the empirical copula process can be approximated by a functional of the low-dimensional margins of the standard empirical process, with the almost sure error term being uniform in the margins. The result has numerous potential applications, and is exemplary applied to the problem of testing pairwise stochastic independence in high dimensions, leading to various extensions of recent results in the literature: for certain test statistics based on pairwise association measures, type-I error control is obtained for models beyond mutual independence. Moreover, bootstrap-based critical values are shown to yield strong control of the familywise error rate for a large class of data generating processes.
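The basic ingredient of the empirical copula process is the matrix of rank-based pseudo-observations. The sketch below (a minimal illustration, not the paper's high-dimensional machinery) computes them for a modest sample, assuming no ties:

```python
import numpy as np

def empirical_copula_pseudo_obs(X):
    """Rank-based pseudo-observations U[i, j] = rank(X[i, j]) / (n + 1),
    computed column-wise; the empirical copula is the empirical
    distribution of these rows. Assumes no ties within a column."""
    n = X.shape[0]
    ranks = X.argsort(axis=0).argsort(axis=0) + 1  # ranks 1..n per column
    return ranks / (n + 1.0)

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))   # n = 100 observations in dimension 5
U = empirical_copula_pseudo_obs(X)
```

Each column of `U` is a permutation of $\{1/(n+1), \ldots, n/(n+1)\}$, so its marginals are uniform by construction.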

[4]  arXiv:2405.05644 [pdf, other]
Title: Estimation of ill-conditioned models using penalized sums of squares of the residuals
Comments: Working paper with 35 pages, 9 tables, 7 figures
Subjects: Statistics Theory (math.ST)

This paper analyzes the estimation of econometric models by penalizing the sum of squares of the residuals with a factor that makes the model estimates approximate those that would be obtained from the simple regressions between the dependent variable of the econometric model and each of its independent variables. It is shown that the ridge estimator is a particular case of the penalized estimator obtained. Upon analysis of its main characteristics, the penalized estimator presents better properties than ridge, especially with respect to the individual bootstrap inference of the model coefficients and the numerical stability of the estimates. This improvement is due to the fact that, instead of shrinking the estimator towards zero, the estimator shrinks towards the estimates of the coefficients of the simple regressions discussed above.
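One way such a penalty can be realized is sketched below. This is our reading of the abstract, not the paper's exact estimator: we minimize $\|y - Xb\|^2 + \lambda\|b - b_{\mathrm{simple}}\|^2$, where $b_{\mathrm{simple},j}$ is the slope of the simple regression of $y$ on column $j$ alone (no intercept, for brevity). Setting $b_{\mathrm{simple}} = 0$ recovers ridge, consistent with the claim that ridge is a particular case.

```python
import numpy as np

def penalized_toward_simple(X, y, lam):
    """Sketch of a penalized least-squares estimator that shrinks towards
    the simple-regression slopes rather than towards zero. Closed form:
    (X'X + lam*I)^{-1} (X'y + lam * b_simple)."""
    p = X.shape[1]
    b_simple = np.array([(X[:, j] @ y) / (X[:, j] @ X[:, j]) for j in range(p)])
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y + lam * b_simple)

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=200)
b_pen = penalized_toward_simple(X, y, lam=1.0)
```

With `lam=0.0` the estimator reduces to ordinary least squares, and as `lam` grows it moves towards the simple-regression slopes instead of towards the origin.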

[5]  arXiv:2405.05656 [pdf, ps, other]
Title: Consistent Empirical Bayes estimation of the mean of a mixing distribution without identifiability assumption. With applications to treatment of non-response
Comments: 17 pages
Subjects: Statistics Theory (math.ST); Methodology (stat.ME)

Consider a Non-Parametric Empirical Bayes (NPEB) setup. We observe $Y_i \sim f(y|\theta_i)$, $\theta_i \in \Theta$, independent, where the $\theta_i \sim G$ are independent, $i=1,\ldots,n$. The mixing distribution $G \in \{G\}$ is unknown, with no parametric assumptions about the class $\{G\}$. The common NPEB task is to estimate $\theta_i$, $i=1,\ldots,n$. Conditions that imply 'optimality' of such NPEB estimators typically require identifiability of $G$ based on $Y_1,\ldots,Y_n$. We consider the task of estimating $E_G \theta$. We show that 'often' consistent estimation of $E_G \theta$ is possible without identifiability.
We motivate the latter task, especially in setups with non-response and missing data, and demonstrate consistency in simulations.
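The abstract's point can be illustrated in a simple special case (our example, not the paper's general construction): if $Y \mid \theta \sim \mathrm{Poisson}(\theta)$, then $E[Y] = E_G[\theta]$, so the sample mean of the $Y_i$ estimates $E_G \theta$ consistently without any need to identify $G$ itself:

```python
import numpy as np

rng = np.random.default_rng(6)

# Mixing distribution G: Gamma(shape=2, scale=1.5), so E_G[theta] = 3.
theta = rng.gamma(shape=2.0, scale=1.5, size=50_000)
# Observations Y_i | theta_i ~ Poisson(theta_i).
Y = rng.poisson(theta)

# E[Y] = E[E[Y | theta]] = E_G[theta], so the plain sample mean of Y
# consistently estimates E_G[theta] = 3, whether or not G is identifiable.
estimate = Y.mean()
```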

[6]  arXiv:2405.05679 [pdf, other]
Title: Non-asymptotic estimates for accelerated high order Langevin Monte Carlo algorithms
Subjects: Statistics Theory (math.ST); Probability (math.PR); Computation (stat.CO); Machine Learning (stat.ML)

In this paper, we propose two new algorithms, namely aHOLA and aHOLLA, to sample from high-dimensional target distributions with possibly super-linearly growing potentials. We establish non-asymptotic convergence bounds for aHOLA in Wasserstein-1 and Wasserstein-2 distances with rates of convergence equal to $1+q/2$ and $1/2+q/4$, respectively, under a local H\"{o}lder condition with exponent $q\in(0,1]$ and a convexity at infinity condition on the potential of the target distribution. Similar results are obtained for aHOLLA under certain global continuity conditions and a dissipativity condition. Crucially, we achieve state-of-the-art rates of convergence of the proposed algorithms in the non-convex setting which are higher than those of the existing algorithms. Numerical experiments are conducted to sample from several distributions and the results support our main findings.
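For context, the first-order baseline that higher-order schemes such as aHOLA and aHOLLA improve upon is the unadjusted Langevin algorithm. The sketch below implements that baseline only (it is not the paper's algorithm), sampling from a standard Gaussian target with potential $U(x) = \|x\|^2/2$:

```python
import numpy as np

def ula(grad_potential, x0, step, n_steps, rng):
    """Unadjusted Langevin algorithm (first-order baseline, not aHOLA):
    x_{k+1} = x_k - step * grad U(x_k) + sqrt(2 * step) * N(0, I)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x - step * grad_potential(x) + np.sqrt(2 * step) * rng.normal(size=x.shape)
    return x

# Target: standard Gaussian in dimension 2, so grad U(x) = x.
rng = np.random.default_rng(3)
samples = np.array([ula(lambda x: x, np.zeros(2), 0.1, 500, rng)
                    for _ in range(500)])
```

The discretization bias of this first-order scheme (here a small inflation of the stationary variance) is exactly the kind of error that higher-order schemes reduce.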

Cross-lists for Fri, 10 May 24

[7]  arXiv:2405.05403 (cross-list from stat.ME) [pdf, other]
Title: A fast and accurate inferential method for complex parametric models: the implicit bootstrap
Subjects: Methodology (stat.ME); Statistics Theory (math.ST); Computation (stat.CO)

Performing inference, such as computing confidence intervals, is traditionally done in the parametric case by first fitting a model and then using the estimates to compute quantities derived at the asymptotic level, or by means of simulations such as those from the family of bootstrap methods. These methods require the derivation and computation of a consistent estimator, which can be very challenging to obtain when the models are complex, as is the case, for example, when the data exhibit features such as censoring or misclassification errors, or contain outliers. In this paper, we propose a simulation-based inferential method, the implicit bootstrap, that bypasses the need to compute a consistent estimator and can therefore be easily implemented. While being transformation respecting, the implicit bootstrap is shown, under conditions similar to those of the studentized bootstrap but without the need for a consistent estimator, to be first- and second-order accurate. Using simulation studies, we also show the coverage accuracy of the method in data settings for which traditional methods are computationally very involved and lead to poor coverage, especially when the sample size is relatively small. Based on these empirical results, we also explore theoretically the case of exact inference.

[8]  arXiv:2405.05459 (cross-list from stat.ME) [pdf, other]
Title: Estimation and Inference for Change Points in Functional Regression Time Series
Subjects: Methodology (stat.ME); Statistics Theory (math.ST)

In this paper, we study the estimation and inference of change points under a functional linear regression model with changes in the slope function. We present a novel Functional Regression Binary Segmentation (FRBS) algorithm which is computationally efficient as well as achieving consistency in multiple change point detection. This algorithm utilizes the predictive power of piece-wise constant functional linear regression models in the reproducing kernel Hilbert space framework. We further propose a refinement step that improves the localization rate of the initial estimator output by FRBS, and derive asymptotic distributions of the refined estimators for two different regimes determined by the magnitude of a change. To facilitate the construction of confidence intervals for underlying change points based on the limiting distribution, we propose a consistent block-type long-run variance estimator. Our theoretical justifications for the proposed approach accommodate temporal dependence and heavy-tailedness in both the functional covariates and the measurement errors. Empirical effectiveness of our methodology is demonstrated through extensive simulation studies and an application to the Standard and Poor's 500 index dataset.
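The recursive search underlying FRBS can be illustrated in the simplest setting: binary segmentation with a CUSUM statistic for changes in a scalar mean. This is only a toy analogue (FRBS tests for changes in a functional regression slope within an RKHS framework), but the recursion is the same:

```python
import numpy as np

def cusum(x, s, e):
    """Max absolute CUSUM statistic on x[s:e] and its argmax position."""
    n = e - s
    stats = np.array([
        np.sqrt((e - t) * (t - s) / n) * abs(x[s:t].mean() - x[t:e].mean())
        for t in range(s + 1, e)
    ])
    k = int(stats.argmax())
    return stats[k], s + 1 + k

def binary_segmentation(x, thresh, s=0, e=None):
    """Generic binary segmentation: split at the CUSUM argmax if the
    statistic exceeds the threshold, then recurse on both halves."""
    if e is None:
        e = len(x)
    if e - s < 2:
        return []
    stat, t = cusum(x, s, e)
    if stat < thresh:
        return []
    return (binary_segmentation(x, thresh, s, t)
            + [t] + binary_segmentation(x, thresh, t, e))

# One mean change of size 3 at position 100 in Gaussian noise.
rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)])
cps = binary_segmentation(x, thresh=6.0)
```

A refinement step like the paper's would then re-localize each detected point within a neighborhood of the initial estimate.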

[9]  arXiv:2405.05512 (cross-list from cs.LG) [pdf, other]
Title: Characteristic Learning for Provable One Step Generation
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Numerical Analysis (math.NA); Statistics Theory (math.ST)

We propose the characteristic generator, a novel one-step generative model that combines the sampling efficiency of Generative Adversarial Networks (GANs) with the stable performance of flow-based models. Our model is driven by characteristics, along which the probability density transport can be described by ordinary differential equations (ODEs). Specifically, we estimate the velocity field through nonparametric regression and utilize the Euler method to solve the probability flow ODE, generating a series of discrete approximations to the characteristics. We then use a deep neural network to fit these characteristics, ensuring a one-step mapping that effectively pushes the prior distribution towards the target distribution. On the theoretical side, we analyze the errors in velocity matching, Euler discretization, and characteristic fitting to establish a non-asymptotic convergence rate for the characteristic generator in the 2-Wasserstein distance. To the best of our knowledge, this is the first thorough analysis of simulation-free one-step generative models. Additionally, our analysis refines the error analysis of flow-based generative models in prior works. We apply our method to both synthetic and real datasets, and the results demonstrate that the characteristic generator achieves high generation quality with just a single evaluation of a neural network.
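The Euler-along-characteristics step can be sketched with a toy velocity field that has a closed-form flow (our choice for illustration; the paper estimates the velocity nonparametrically and then fits a network to the resulting characteristics, both omitted here):

```python
import numpy as np

def euler_characteristics(x0, velocity, n_steps):
    """Euler discretization of the probability-flow ODE dx/dt = v(x, t)
    on t in [0, 1], producing discrete approximations to the
    characteristics started at x0."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * velocity(x, k * dt)
    return x

# Toy field: v(x, t) = x * log(2) transports N(0, 1) at t = 0 to
# N(0, 4) at t = 1, since each characteristic is x(t) = x(0) * 2**t.
rng = np.random.default_rng(5)
z = rng.normal(size=20_000)            # prior samples
x1 = euler_characteristics(z, lambda x, t: x * np.log(2.0), n_steps=100)
```

In the paper's scheme, a network would then be trained to map `z` directly to `x1`, giving one-step generation.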

Replacements for Fri, 10 May 24

[10]  arXiv:2206.06491 (replaced) [pdf, other]
Title: On the Computational Complexity of Metropolis-Adjusted Langevin Algorithms for Bayesian Posterior Sampling
Authors: Rong Tang, Yun Yang
Subjects: Statistics Theory (math.ST)
[11]  arXiv:2212.09706 (replaced) [pdf, ps, other]
Title: Multiple testing under negative dependence
Comments: 28 pages, 5 figures
Subjects: Statistics Theory (math.ST); Probability (math.PR); Methodology (stat.ME)
[12]  arXiv:2303.08122 (replaced) [pdf, ps, other]
Title: Codivergences and information matrices
Comments: 30 pages, 1 figure, 1 table. This is an extended version of Section 2.2 of arXiv:2006.00278v3 (most of this content has been removed in the next version (arXiv:2006.00278v4) and link to this separate paper instead)
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT); Probability (math.PR)
[13]  arXiv:2309.08538 (replaced) [pdf, ps, other]
Title: Jittering and Clustering: Strategies for the Construction of Robust Designs
Authors: Douglas Wiens
Subjects: Statistics Theory (math.ST)
[14]  arXiv:2311.04618 (replaced) [pdf, other]
Title: Multivariate generalized Pareto distributions along extreme directions
Subjects: Statistics Theory (math.ST)
[15]  arXiv:2211.07861 (replaced) [pdf, other]
Title: Regularized Stein Variational Gradient Flow
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Analysis of PDEs (math.AP); Numerical Analysis (math.NA); Statistics Theory (math.ST); Computation (stat.CO)
[16]  arXiv:2309.12544 (replaced) [pdf, ps, other]
Title: Stability and Statistical Inversion of Travel time Tomography
Subjects: Differential Geometry (math.DG); Statistics Theory (math.ST)

