Statistics Theory
New submissions
New submissions for Tue, 19 Oct 21
 [1] arXiv:2110.08523 [pdf, ps, other]

Title: Spectral measures of empirical autocovariance matrices of high dimensional Gaussian stationary processes
Subjects: Statistics Theory (math.ST); Probability (math.PR)
Consider the empirical autocovariance matrix at a given nonzero time lag based on observations from a multivariate complex Gaussian stationary time series. The spectral analysis of these autocovariance matrices can be useful in certain statistical problems, such as those related to testing for white noise. We study the behavior of their spectral measures in the asymptotic regime where the time series dimension and the observation window length both grow to infinity, and at the same rate. Following a general framework in the field of the spectral analysis of large random non-Hermitian matrices, the probabilistic behavior of the small singular values of the shifted versions of the autocovariance matrix is obtained first. This is then used to infer the large-sample behavior of the empirical spectral measure of the autocovariance matrices at any lag. Matrix orthogonal polynomials on the unit circle play a crucial role in our study.
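As a rough illustration of the object studied in this abstract, the sketch below builds an empirical lag-L autocovariance matrix from complex Gaussian white noise and computes its eigenvalues. The function name `empirical_autocovariance`, the 1/N normalization, and the chosen dimensions are assumptions made for illustration; the paper may use a different convention.

```python
import numpy as np

def empirical_autocovariance(Y, lag):
    """Empirical autocovariance at a given lag (hypothetical helper).

    Y is an (M, N) array whose columns are the observations y_1, ..., y_N.
    Returns (1/N) * sum_t y_{t+lag} y_t^*, an M x M matrix that is in
    general non-Hermitian when lag != 0. The 1/N scaling is an assumption.
    """
    M, N = Y.shape
    if not 0 <= lag < N:
        raise ValueError("lag must satisfy 0 <= lag < N")
    return Y[:, lag:] @ Y[:, :N - lag].conj().T / N

rng = np.random.default_rng(0)
M, N = 50, 100  # dimension and window length of comparable size
# Complex Gaussian white noise with unit variance per entry.
Y = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)

R1 = empirical_autocovariance(Y, lag=1)
eigenvalues = np.linalg.eigvals(R1)  # complex spectrum of the lag-1 matrix
```

Plotting `eigenvalues` in the complex plane for growing M and N at a fixed ratio would give an empirical view of the spectral measure discussed above.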
 [2] arXiv:2110.08766 [pdf, ps, other]

Title: On minimax estimation problem for stationary stochastic sequences from observations in special sets of points
Comments: arXiv admin note: text overlap with arXiv:1804.08408
Subjects: Statistics Theory (math.ST)
The problem of the mean-square optimal estimation of the linear functionals which depend on the unknown values of a stochastic stationary sequence from observations of the sequence in special sets of points is considered. Formulas for calculating the mean-square error and the spectral characteristic of the optimal linear estimate of the functionals are derived under the condition of spectral certainty, where the spectral density of the sequence is exactly known. The minimax (robust) method of estimation is applied in the case where the spectral density of the sequence is not known exactly while some sets of admissible spectral densities are given. Formulas that determine the least favourable spectral densities and the minimax spectral characteristics are derived for some special sets of admissible densities.
 [3] arXiv:2110.09042 [pdf, other]

Title: Kernel-based estimation for partially functional linear model: Minimax rates and randomized sketches
Subjects: Statistics Theory (math.ST); Machine Learning (stat.ML)
This paper considers the partially functional linear model (PFLM) where all predictive features consist of a functional covariate and a high-dimensional scalar vector. Over an infinite-dimensional reproducing kernel Hilbert space, the proposed estimator for PFLM is a least-squares approach with two mixed regularizations: a function-norm and an $\ell_1$-norm. Our main task in this paper is to establish the minimax rates for PFLM under the high-dimensional setting, and the optimal minimax rates of estimation are established by using various techniques in empirical process theory for analyzing kernel classes. In addition, we propose an efficient numerical algorithm based on randomized sketches of the kernel matrix. Several numerical experiments are implemented to support our method and optimization strategy.
 [4] arXiv:2110.09333 [pdf, other]

Title: Regression with Missing Data, a Comparison Study of Techniques Based on Random Forests
Subjects: Statistics Theory (math.ST); Machine Learning (stat.ML)
In this paper we present the practical benefits of a new random forest algorithm to deal with missing values in the sample. The purpose of this work is to compare the different solutions to deal with missing values with random forests and describe our new algorithm's performance as well as its algorithmic complexity. A variety of missing value mechanisms (such as MCAR, MAR, MNAR) are considered and simulated. We study the quadratic errors and the bias of our algorithm and compare it to the most popular missing values random forests algorithms in the literature. In particular, we compare those techniques for both regression and prediction purposes. This work follows a first paper, Gomez-Mendez and Joly (2020), on the consistency of this new algorithm.
 [5] arXiv:2110.09502 [pdf, other]

Title: Minimum $\ell_{1}$-norm interpolators: Precise asymptotics and multiple descent
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT); Machine Learning (cs.LG); Signal Processing (eess.SP); Machine Learning (stat.ML)
An evolving line of machine learning work observes empirical evidence suggesting that interpolating estimators (the ones that achieve zero training error) may not necessarily be harmful. This paper pursues theoretical understanding of an important type of interpolator: the minimum $\ell_{1}$-norm interpolator, which is motivated by the observation that several learning algorithms favor low $\ell_1$-norm solutions in the overparameterized regime. Concretely, we consider the noisy sparse regression model under Gaussian design, focusing on linear sparsity and high-dimensional asymptotics (so that both the number of features and the sparsity level scale proportionally with the sample size).
We observe, and provide rigorous theoretical justification for, a curious multi-descent phenomenon; that is, the generalization risk of the minimum $\ell_1$-norm interpolator undergoes multiple (and possibly more than two) phases of descent and ascent as one increases the model capacity. This phenomenon stems from the special structure of the minimum $\ell_1$-norm interpolator as well as the delicate interplay between the overparameterized ratio and the sparsity, thus unveiling a fundamental distinction in geometry from the minimum $\ell_2$-norm interpolator. Our finding is built upon an exact characterization of the risk behavior, which is governed by a system of two nonlinear equations with two unknowns.
Cross-lists for Tue, 19 Oct 21
 [6] arXiv:2110.08348 (cross-list from q-bio.PE) [pdf, other]

Title: Estimating individual admixture from finite reference databases
Comments: 17 pages, 3 figures
Subjects: Populations and Evolution (q-bio.PE); Statistics Theory (math.ST)
The concept of individual admixture (IA) assumes that the genome of individuals is composed of alleles inherited from $K$ ancestral populations. Each copy of each allele has the same chance $q_k$ to originate from population $k$; this, together with the allele frequencies $p$ in all populations, comprises the admixture model, which is the basis for software like STRUCTURE and ADMIXTURE. Here, we assume that $p$ is given through a finite reference database, and $q$ is estimated via maximum likelihood. Above all, we are interested in efficient estimation of $q$, and the variance of the estimator which originates from finiteness of the reference database, i.e., a variance in $p$. We provide a central limit theorem for the maximum-likelihood estimator, give simulation results, and discuss applications in forensic genetics.
 [7] arXiv:2110.08500 (cross-list from stat.ML) [pdf, other]

Title: On Model Selection Consistency of Lasso for High-Dimensional Ising Models on Tree-like Graphs
Comments: 30 pages, 4 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
We consider the problem of high-dimensional Ising model selection using the neighborhood-based least absolute shrinkage and selection operator (Lasso). It is rigorously proved that under some mild coherence conditions on the population covariance matrix of the Ising model, consistent model selection can be achieved with sample sizes $n=\Omega{(d^3\log{p})}$ for any tree-like graph in the paramagnetic phase, where $p$ is the number of variables and $d$ is the maximum node degree. When the same conditions are imposed directly on the sample covariance matrices, it is shown that a reduced sample size $n=\Omega{(d^2\log{p})}$ suffices. The obtained sufficient conditions for consistent model selection with Lasso have the same sample-complexity scaling as those of $\ell_1$-regularized logistic regression. Given the popularity and efficiency of Lasso, our rigorous analysis provides a theoretical backing for its practical use in Ising model selection.
 [8] arXiv:2110.08570 (cross-list from stat.ME) [pdf, other]

Title: A Reduced-Bias Weighted least square estimation of the Extreme Value Index
Comments: 24 pages
Subjects: Methodology (stat.ME); Statistics Theory (math.ST); Applications (stat.AP)
In this paper, we propose a reduced-bias estimator of the extreme value index (EVI) for Pareto-type (heavy-tailed) distributions. This is derived using the weighted least squares method. It is shown that the estimator is unbiased, consistent and asymptotically normal under the second-order conditions on the underlying distribution of the data. The finite sample properties of the proposed estimator are studied through a simulation study. The results show that it is competitive with the existing estimators of the extreme value index in terms of bias and Mean Square Error. In addition, it yields estimates of $\gamma>0$ that are less sensitive to the number of top-order statistics, and hence, can be used for selecting an optimal tail fraction. The proposed estimator is further illustrated using practical pedochemical and insurance datasets.
 [9] arXiv:2110.08665 (cross-list from stat.ME) [pdf, other]

Title: Quantile Regression by Dyadic CART
Subjects: Methodology (stat.ME); Statistics Theory (math.ST)
In this paper we propose and study a version of the Dyadic Classification and Regression Trees (DCART) estimator from Donoho (1997) for (fixed design) quantile regression in general dimensions. We refer to this proposed estimator as the QDCART estimator. Just like the mean regression version, we show that a) a fast dynamic programming based algorithm with computational complexity $O(N \log N)$ exists for computing the QDCART estimator and b) an oracle risk bound (trading off squared error and a complexity parameter of the true signal) holds for the QDCART estimator. This oracle risk bound then allows us to demonstrate that the QDCART estimator enjoys adaptively rate optimal estimation guarantees for piecewise constant and bounded variation function classes. In contrast to existing results for the DCART estimator, which require subgaussianity of the error distribution, our estimation guarantees do not need any restrictive tail decay assumptions on the error distribution. For instance, our results hold even when the error distribution has no first moment, as with the Cauchy distribution. Apart from the Dyadic CART method, we also consider other variant methods such as the Optimal Regression Tree (ORT) estimator introduced in Chatterjee and Goswami (2019). In particular, we also extend the ORT estimator to the quantile setting and establish that it enjoys analogous guarantees. Thus, this paper extends the scope of these globally optimal regression tree based methodologies to heavy-tailed data. We then perform extensive numerical experiments on both simulated and real data which illustrate the usefulness of the proposed methods.
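A small sketch may help show why quantile-based fits tolerate heavy tails. It uses the standard check (pinball) loss of quantile regression and fits a single constant, which is what a piecewise constant estimator does within one cell of a partition. This is not the QDCART algorithm itself; the names `check_loss` and `best_constant_fit` and the toy data are hypothetical.

```python
def check_loss(residual, tau):
    """Check (pinball) loss rho_tau(r) = r * (tau - 1{r < 0})."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def best_constant_fit(ys, tau):
    """Return the data value minimizing the total check loss.

    The objective is convex and piecewise linear with kinks at the data
    points, so searching over the data points finds a minimizer.
    """
    return min(ys, key=lambda c: sum(check_loss(y - c, tau) for y in ys))

# Toy sample with one extreme observation (illustrative data only).
ys = [1.0, 2.0, 3.0, 4.0, 1000.0]
median_fit = best_constant_fit(ys, 0.5)  # the sample median
mean_fit = sum(ys) / len(ys)             # the least-squares constant fit
```

Here `median_fit` stays at the sample median while `mean_fit` is pulled far toward the outlier, which is the kind of robustness the abstract attributes to the quantile setting.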
 [10] arXiv:2110.08884 (cross-list from stat.ML) [pdf, other]

Title: Persuasion by Dimension Reduction
Comments: arXiv admin note: text overlap with arXiv:2102.10909
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); General Economics (econ.GN); Statistics Theory (math.ST); Methodology (stat.ME)
How should an agent (the sender) observing multidimensional data (the state vector) persuade another agent to take the desired action? We show that it is always optimal for the sender to perform a (nonlinear) dimension reduction by projecting the state vector onto a lower-dimensional object that we call the "optimal information manifold." We characterize geometric properties of this manifold and link them to the sender's preferences. Optimal policy splits information into "good" and "bad" components. When the sender's marginal utility is linear, revealing the full magnitude of good information is always optimal. In contrast, with concave marginal utility, optimal information design conceals the extreme realizations of good information and only reveals its direction (sign). We illustrate these effects by explicitly solving several multidimensional Bayesian persuasion problems.
 [11] arXiv:2110.08905 (cross-list from stat.AP) [pdf, other]

Title: Exploitation of error correlation in a large analysis validation: GlobCurrent case study
Authors: Richard E. Danielson, Johnny A. Johannessen, Graham D. Quartly, Marie-Hélène Rio, Bertrand Chapron, Fabrice Collard, Craig Donlon
Comments: 24 pages, 14 figures
Journal-ref: Remote Sens. Environ., 217, 476-490 (2018)
Subjects: Applications (stat.AP); Statistics Theory (math.ST); Methodology (stat.ME)
An assessment of variance in ocean current signal and noise shared by in situ observations (drifters) and a large gridded analysis (GlobCurrent) is sought as a function of day of the year for 1993-2015 and across a broad spectrum of current speed. Regardless of the division of collocations, it is difficult to claim that any synoptic assessment can be based on independent observations. Instead, a measurement model that departs from ordinary linear regression by accommodating error correlation is proposed. The interpretation of independence is explored by applying Fuller's (1987) concept of equation and measurement error to a division of error into shared (correlated) and unshared (uncorrelated) components, respectively. The resulting division of variance in the new model favours noise. Ocean current shared (equation) error is of comparable magnitude to unshared (measurement) error and the latter is, for GlobCurrent and drifters respectively, comparable to ordinary and reverse linear regression. Although signal variance appears to be small, its utility as a measure of agreement between two variates is highlighted.
Sparse collocations that sample a dense grid permit a first order autoregressive form of measurement model to be considered, including parameterizations of analysis-in situ error cross-correlation and analysis temporal error autocorrelation. The former (cross-correlation) is an equation error term that accommodates error shared by both GlobCurrent and drifters. The latter (autocorrelation) facilitates an identification and retrieval of all model parameters. Solutions are sought using a prescribed calibration between GlobCurrent and drifters (by variance matching). Because the true current variance of GlobCurrent and drifters is small, signal to noise ratio is near zero at best. This is particularly evident for moderate current speed and meridional current component.
 [12] arXiv:2110.08969 (cross-list from stat.AP) [pdf, ps, other]

Title: On completing a measurement model by symmetry
Authors: Richard E. Danielson
Comments: 4 pages
Subjects: Applications (stat.AP); Statistics Theory (math.ST); Methodology (stat.ME)
An appeal for symmetry is made to build established notions of specific representation and specific nonlinearity of measurement (often called model error) into a canonical linear regression model. Additive components are derived from the trivially complete model M = m. Factor analysis and equation error motivate corresponding notions of representation and nonlinearity in an errors-in-variables framework, with a novel interpretation of terms. It is suggested that a modern interpretation of correlation involves both linear and nonlinear association.
Replacements for Tue, 19 Oct 21
 [13] arXiv:2008.08275 (replaced) [pdf, other]

Title: Asymptotic Analysis for Data-Driven Inventory Policies
Subjects: Statistics Theory (math.ST)
 [14] arXiv:2104.14023 (replaced) [pdf, ps, other]

Title: Measuring dependence between random vectors via optimal transport
Subjects: Statistics Theory (math.ST)
 [15] arXiv:2106.09769 (replaced) [pdf, other]

Title: Generalized regression operator estimation for continuous time functional data processes with missing at random response
Subjects: Statistics Theory (math.ST); Methodology (stat.ME)
 [16] arXiv:1904.11060 (replaced) [pdf, ps, other]

Title: Normal Approximation in Large Network Models
Subjects: Econometrics (econ.EM); Statistics Theory (math.ST)
 [17] arXiv:2003.10323 (replaced) [pdf, ps, other]

Title: Monte Carlo integration of non-differentiable functions on $[0,1]^ι$, $ι=1,\dots,d$, using a single determinantal point pattern defined on $[0,1]^d$
Subjects: Computation (stat.CO); Classical Analysis and ODEs (math.CA); Numerical Analysis (math.NA); Statistics Theory (math.ST)
 [18] arXiv:2007.04803 (replaced) [pdf, other]

Title: A Global Stochastic Optimization Particle Filter Algorithm
Comments: 61 pages, 4 figures
Subjects: Machine Learning (stat.ML); Statistics Theory (math.ST); Computation (stat.CO)
 [19] arXiv:2008.11140 (replaced) [pdf, other]

Title: Powerful Inference
Comments: 29 pages, 4 figures, 3 tables
Subjects: Econometrics (econ.EM); Statistics Theory (math.ST)
 [20] arXiv:2101.12353 (replaced) [pdf, other]

Title: On the capacity of deep generative networks for approximating distributions
Subjects: Machine Learning (cs.LG); Probability (math.PR); Statistics Theory (math.ST); Machine Learning (stat.ML)
 [21] arXiv:2104.04910 (replaced) [pdf, other]

Title: Semi-$G$-normal: a Hybrid between Normal and $G$-normal (Full Version)
Comments: 109 pages, 8 figures, a comprehensive document for conference and open discussions, to be divided later for publications, readers may navigate to the parts they are interested in by the table of contents
Subjects: Probability (math.PR); Statistics Theory (math.ST)
 [22] arXiv:2105.03425 (replaced) [pdf, other]

Title: Kernel Two-Sample Tests for Manifold Data
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
 [23] arXiv:2105.08024 (replaced) [pdf, other]

Title: Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting
Subjects: Machine Learning (cs.LG); Information Theory (cs.IT); Optimization and Control (math.OC); Statistics Theory (math.ST); Machine Learning (stat.ML)
 [24] arXiv:2106.03227 (replaced) [pdf, other]

Title: Neural Tangent Kernel Maximum Mean Discrepancy
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
 [25] arXiv:2109.05578 (replaced) [pdf, other]

Title: Kernel PCA with the Nyström method
Authors: Fredrik Hallgren
Comments: 44 pages, 6 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
 [26] arXiv:2109.11307 (replaced) [pdf, other]

Title: Semiparametric bivariate extreme-value copulas
Authors: Javier Fernández Serrano
Comments: 23 pages, 22 figures
Subjects: Methodology (stat.ME); Statistics Theory (math.ST); Computation (stat.CO)
 [27] arXiv:2110.01593 (replaced) [pdf, other]

Title: Generalized Kernel Thinning
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST); Methodology (stat.ME)