Ankit Pensia

Postdoctoral Fellow, IBM Research

Hi! I am a Herman Goldstine Postdoctoral Fellow at the MIT-IBM Watson AI Lab. My research interests include robust and heavy-tailed statistics, inference under constraints, machine learning, and high-dimensional statistics.

I received my Ph.D. from the Computer Sciences department at UW-Madison in 2023, where I was advised by Po-Ling Loh, Varun Jog, and Ilias Diakonikolas. My dissertation, which received the Graduate Student Research Award from the CS Department, may be found at this link. I also interned at Google Research NYC with Pranjal Awasthi and Satyen Kale. Prior to Madison, I spent five memorable years at IIT Kanpur.

Please feel free to contact me at ankitpensia94@gmail.com.

# Selected Publications (Full list, Google Scholar, dblp, Semantic Scholar)

### Robust and Heavy-Tailed Statistics

Outliers and heavy-tailed distributions pose significant challenges to standard inference procedures and are studied in the field of *robust statistics*. I am interested in exploring the statistical and computational landscape of such estimation problems.

- A Sub-Quadratic Time Algorithm for Robust Sparse Mean Estimation [link] [abstract]

**ICML**, 2024 (**Spotlight**)

Existing algorithms for robust sparse estimation run in $\Omega(d^2)$ time, where $d$ is the dimension. Thus, while sparsity improves the statistical performance (in terms of sample complexity), these benefits do not translate into practice because of the high computational cost. We developed the first subquadratic algorithm. For an introduction to this problem (and this result), please see the slides of this survey talk.

- Outlier Robust Mean Estimation with Subgaussian Rates via Stability [link] [abstract]

**NeurIPS**, 2020

We show that recent outlier-robust algorithms also achieve subgaussian confidence intervals for heavy-tailed distributions, and vice versa. In particular, we identify the "stability" condition as the bridge between these two contamination models. In a recent paper at NeurIPS 2022, we extended these results to robust *sparse* mean estimation, i.e., when the mean is sparse.
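As a toy illustration of the contamination setting (this is *not* the stability-based algorithm from the paper), a coordinate-wise trimmed mean already resists an $\epsilon$-fraction of adversarial outliers that would badly corrupt the sample mean; all names and parameters below are hypothetical:

```python
import numpy as np

def trimmed_mean(samples, eps):
    """Coordinate-wise trimmed mean: discard the eps-fraction of extreme
    values on each side of every coordinate, then average the rest.
    An illustrative (not statistically optimal) robust estimator."""
    n, _ = samples.shape
    k = int(np.ceil(eps * n))            # number trimmed on each side
    sorted_cols = np.sort(samples, axis=0)
    return sorted_cols[k:n - k].mean(axis=0)

rng = np.random.default_rng(0)
n, d, eps = 1000, 5, 0.1
data = rng.normal(0.0, 1.0, size=(n, d))   # true mean is 0
data[: int(eps * n)] = 100.0               # eps-fraction of gross outliers

naive = data.mean(axis=0)                  # dragged far from 0 by outliers
robust = trimmed_mean(data, eps)           # stays close to 0
print(np.abs(naive).max(), np.abs(robust).max())
```

The sample mean incurs error on the order of $\epsilon \cdot 100$ here, while trimming removes the outliers at the cost of a small bias.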

### Inference under Communication, Memory, and Privacy Constraints

The proliferation of big data has led to distributed inference paradigms such as federated learning, which impose constraints on communication bandwidth, memory usage, and privacy. My research focuses on understanding the impact of these constraints.

- Simple Binary Hypothesis Testing under Local Differential Privacy and Communication Constraints [link] [abstract]

**COLT**, 2023

We characterize the minimax optimal sample complexity of hypothesis testing under local differential privacy and communication constraints, develop instance-optimal algorithms, and show a separation between the sample complexities for binary and ternary distributions. In this paper, we build on our recent prior work, where we investigated the cost of communication constraints alone.
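For readers unfamiliar with the setting, the classical randomized-response mechanism (a textbook example, not the instance-optimal scheme from the paper) shows how a single privatized bit per user satisfies both an $\epsilon$-LDP constraint and a one-bit communication constraint; the helper names below are illustrative:

```python
import numpy as np

def randomized_response(bit, eps, rng):
    """Classical eps-LDP mechanism for one bit: report truthfully with
    probability e^eps / (e^eps + 1), and flip the bit otherwise. Each
    report is one bit, so it also meets a 1-bit communication budget."""
    p_true = np.exp(eps) / (np.exp(eps) + 1.0)
    return bit if rng.random() < p_true else 1 - bit

def debiased_mean(reports, eps):
    """Unbiased estimate of the true mean of the input bits:
    E[report] = mu*(2p - 1) + (1 - p), so invert that affine map."""
    p = np.exp(eps) / (np.exp(eps) + 1.0)
    return (np.mean(reports) - (1.0 - p)) / (2.0 * p - 1.0)

rng = np.random.default_rng(1)
eps = 1.0
bits = rng.random(50000) < 0.3        # true mean 0.3
reports = [randomized_response(int(b), eps, rng) for b in bits]
est = debiased_mean(reports, eps)
print(est)                            # close to 0.3
```

The debiasing step blows up the variance by roughly $(e^\epsilon + 1)/(e^\epsilon - 1)$, which is one concrete way privacy constraints inflate sample complexity.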

- Streaming Algorithms for High-Dimensional Robust Statistics [link] [abstract]

**ICML**, 2022

We develop the first (computationally-efficient and sample-efficient) streaming algorithm with $o(d^2)$ memory usage for a variety of robust estimation tasks; in fact, our algorithm uses only $\tilde{O}(d)$ space. All prior robust algorithms needed to store the entire dataset in memory, leading to quadratic memory usage. In a recent paper at ICML 2023, we extended this result to robust PCA.

### Machine Learning and Statistics

I have a broad interest in the fields of machine learning and statistics.

- The Sample Complexity of Simple Binary Hypothesis Testing [link] [abstract]

**COLT**, 2024

We revisit simple binary hypothesis testing, a fundamental problem in statistics. Despite a century of study, the non-asymptotic rates for this problem were not known (in contrast, the asymptotic error rates are very well-understood). We close this gap and derive tight sample complexity bounds.
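As background for this problem (a classical fact, not the paper's new bounds), the sample complexity of distinguishing two fixed distributions with the likelihood-ratio test scales as $\Theta(1/H^2(p,q))$, where $H$ is the Hellinger distance. A small sketch, with illustrative distributions:

```python
import numpy as np

def hellinger_sq(p, q):
    """Squared Hellinger distance between discrete distributions p and q."""
    return 0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)

def likelihood_ratio_test(samples, p, q):
    """Neyman-Pearson test: decide hypothesis 0 (i.e., p) iff the
    log-likelihood ratio is nonnegative."""
    llr = np.sum(np.log(p[samples]) - np.log(q[samples]))
    return 0 if llr >= 0 else 1

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.3, 0.3, 0.4])
h2 = hellinger_sq(p, q)
n = int(np.ceil(10.0 / h2))   # heuristic: Theta(1/H^2) samples suffice

rng = np.random.default_rng(2)
trials = 500
errors = sum(
    likelihood_ratio_test(rng.choice(3, size=n, p=p), p, q) != 0
    for _ in range(trials)
)
error_rate = errors / trials
print(h2, n, error_rate)      # error rate is small at this sample size
```

With $n \approx 10/H^2$ samples drawn from $p$, the empirical error rate of the test is near zero, matching the $\Theta(1/H^2)$ scaling.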

- Generalization Error Bounds for Noisy, Iterative Algorithms [link] [abstract]

**ISIT**, 2018

We provide a general recipe for bounding the generalization error of noisy, iterative algorithms using information-theoretic tools. A prominent example of such an algorithm is Stochastic Gradient Langevin Dynamics (SGLD).
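For concreteness, the SGLD update is an ordinary gradient step plus isotropic Gaussian noise with variance $2\eta/\beta$; the injected noise is what makes the iterates amenable to information-theoretic analysis. A minimal sketch on a toy quadratic objective (the problem and parameter values are illustrative):

```python
import numpy as np

def sgld_step(w, grad, eta, beta, rng):
    """One SGLD step: a gradient step with step size eta, plus Gaussian
    noise of variance 2*eta/beta (beta is the inverse temperature)."""
    return w - eta * grad + np.sqrt(2.0 * eta / beta) * rng.normal(size=w.shape)

# Minimize f(w) = 0.5 * ||w||^2 with SGLD; gradient of f is simply w.
rng = np.random.default_rng(3)
w = np.full(2, 5.0)
eta, beta = 0.1, 100.0
for _ in range(500):
    w = sgld_step(w, w, eta, beta, rng)
print(np.linalg.norm(w))   # iterates hover near the minimizer at 0
```

In the long run the iterates approximately sample from the Gibbs distribution $\propto e^{-\beta f(w)}$, so they concentrate near the minimizer up to noise of scale $1/\sqrt{\beta}$.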

## Teaching Experience

**Graduate Teaching Assistant**

Mathematical Foundations of Machine Learning, UW-Madison, Fall 2018

Introduction to Bioinformatics, UW-Madison, Fall 2017

Probabilistic Mobile Robotics, IIT Kanpur, Fall 2016, Spring 2016

Communication Systems, IIT Kanpur, Summer 2016

**Academic Mentor**

Representation and Analysis of Random Signals, IIT Kanpur, Fall 201