# 2024 Spring Group Meeting

**Published:**

Paper list and schedule for the spring group meeting.

### Schedule

- 5/27 Simulation Free algorithms (Diffusion) and Stochastic Control
- 6/2 Overview of Yiping’s research and organization meeting
- 6/9 Mean-Field Langevin for feature learning
- 6/16 Reproducing Kernel Space and Statistical Query Bounds
- 6/23 Why does the two-timescale Q-learning converge to different mean field solutions? A unified convergence analysis by Jing An
- 6/30 Optimal Stable Nonlinear Approximation/Scaling Law
- 7/7 Hybrid Scientific computing and Machine learning
- 7/14 Optimal Rates of Kernel Ridge Regression under Source Condition in Large Dimensions by Haobo Zhang
- 7/21 Variational Monte Carlo

### Paper List

Another good reading group: ML Theory @ GT

#### Techniques

- The CLT in high dimensions: quantitative bounds via martingale embedding
- Statistical algorithms and a lower bound for detecting planted cliques [My Note]
- Localization schemes: A framework for proving mixing bounds for Markov chains
- Dualizing Le Cam’s method for functional estimation, with applications to estimating the unseens
- Optimal Stable Nonlinear Approximation, Optimal Learning
- The Covering Number in Learning Theory, Capacity of Reproducing Kernel Spaces in Learning Theory, Covering numbers of Gaussian reproducing kernel Hilbert spaces

#### Machine Learning Theory

- Kernel ridge regression inference
- Localization, convexity, and star aggregation
- Computational-Statistical Gaps in Gaussian Single-Index Models
- Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension
- Adaptive learning rates for support vector machines working on data with low intrinsic dimension
- Which Spaces can be Embedded in Reproducing Kernel Hilbert Spaces?
- How Transformers Learn Causal Structure with Gradient Descent
- Saddle-to-Saddle Dynamics in Diagonal Linear Networks
- Learning time-scales in two-layers neural networks
- A duality framework for generalization analysis of random feature models and two-layer neural networks
- Generalization in kernel regression under realistic assumptions
- Characterizing Overfitting in Kernel Ridgeless Regression Through the Eigenspectrum
- Residual Alignment: Uncovering the Mechanisms of Residual Networks
- Stochastic Localization via Iterative Posterior Sampling
- The Sample Size Required in Importance Sampling
- Regularized DeepIV with Model Selection
- Statistical indistinguishability of learning algorithms
- Majority-of-Three: The Simplest Optimal Learner?
- The fundamental limits of structure-agnostic functional estimation
- Detection in the stochastic block model with multiple clusters: proof of the achievability conjectures, acyclic BP, and the information-computation gap
- Optimal Rates of Kernel Ridge Regression under Source Condition in Large Dimensions
- When do exact and powerful p-values and e-values exist?
- A Tale of Tails: Model Collapse as a Change of Scaling Laws
- Understanding LLMs Requires More Than Statistical Generalization
- The Perception-Distortion Tradeoff, A Theory of the Distortion-Perception Tradeoff in Wasserstein Space

#### Machine Learning

- Unifying Generative Models with GFlowNets and Beyond
- Fractal Patterns May Unravel the Intelligence in Next-Token Prediction
- Is Cosine-Similarity of Embeddings Really About Similarity?
- Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training
- Score-based Causal Representation Learning with Interventions
- Causal Modeling with Stationary Diffusions
- Measure transport with kernel mean embeddings
- Closed-form Filtering for Non-linear Systems
- Batch and match: black-box variational inference with a score-based divergence
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
- Statistical-Computational Trade-offs in Tensor PCA and Related Problems via Communication Complexity
- Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory
- Observational Scaling Laws and the Predictability of Language Model Performance

#### Stochastic simulation and stochastic control

- Can local particle filters beat the curse of dimensionality?
- Stochastic Optimal Control Matching
- Sinkhorn Flow: A Continuous-Time Framework for Understanding and Generalizing the Sinkhorn Algorithm
- Diffusion Schrödinger bridge matching
- Multilevel Picard iterations for solving smooth semilinear parabolic heat equations
- A survey of the Schrödinger problem and some of its connections with optimal transport
- Understanding the training of infinitely deep and wide ResNets with Conditional Optimal Transport
- The large deviation approach to statistical mechanics
- Shifted Composition I: Harnack and Reverse Transport Inequalities
- Geometry and analytic properties of the sliced Wasserstein space (section 5)
- Diffusion copulas: Identification and estimation. JOE 2021.