# Publications

…$O(n^2)$ resource in space and time. The HD algorithm overcomes this $O(n^2)$ bottleneck by using the principle of deferred decisions: rather than fixing the entire random matrix in advance, it lets the randomness unfold with the dynamics. At the heart of this matrix-free algorithm is an adaptive and recursive construction of (random) Householder reflectors. These orthogonal transformations exploit the group symmetry of the matrix ensembles, while simultaneously maintaining the statistical correlations induced by the dynamics. The memory and computation costs of the HD algorithm are $O(nT)$ and $O(nT^2)$, respectively, with $T$ being the number of iterations. When $T \ll n$, which is nearly always the case in practice, the new algorithm leads to significant reductions in runtime and memory footprint. Numerical results demonstrate the promise of the HD algorithm as a new computational tool in the study of high-dimensional random systems.
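The building block named above, a Householder reflector, can be applied to a vector in $O(n)$ time without ever forming an $n \times n$ matrix. The following is a minimal sketch of that single step only, not the HD algorithm itself; the function names `householder_vector` and `apply_reflector` are hypothetical and chosen for illustration.

```python
import math


def householder_vector(x, y):
    """Return v such that H = I - 2 v v^T / (v^T v) maps x to y.

    Assumes x and y have the same Euclidean norm (illustrative helper,
    not taken from the paper).
    """
    return [xi - yi for xi, yi in zip(x, y)]


def apply_reflector(v, z):
    """Apply H = I - 2 v v^T / (v^T v) to z without building the matrix."""
    vv = sum(vi * vi for vi in v)
    if vv == 0.0:
        # v = 0 means x == y, so the reflector degenerates to the identity.
        return list(z)
    scale = 2.0 * sum(vi * zi for vi, zi in zip(v, z)) / vv
    return [zi - scale * vi for vi, zi in zip(v, z)]


# Reflect x = (3, 4, 0) onto the first coordinate axis, keeping its norm 5.
x = [3.0, 4.0, 0.0]
y = [math.sqrt(sum(xi * xi for xi in x)), 0.0, 0.0]
v = householder_vector(x, y)
hx = apply_reflector(v, x)
```

Each application costs one inner product and one vector update, and only the length-$n$ vector `v` has to be stored, which is what makes a matrix-free representation of this kind possible.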

We study the problem of detecting a random walk on a graph from a sequence of noisy measurements at every node. There are two hypotheses: either every observation is just meaningless zero-mean Gaussian noise, or at each time step exactly one node has an elevated mean, with its location following a random walk on the graph over time. We want to exploit knowledge of the graph structure and random walk parameters (specified by a Markov chain transition matrix) to detect a possibly very weak signal. The optimal detector is easily derived, and we focus on the harder problem of characterizing its performance through the (type-II) error exponent: the decay rate of the miss probability under a false alarm constraint.
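One standard way to evaluate the likelihood ratio for this kind of hidden-Markov hypothesis test is a forward recursion over the walker's position. The sketch below assumes unit-variance noise, a known elevated mean `mu`, and a known initial distribution `pi`; these modeling details and the function name `log_likelihood_ratio` are assumptions made for illustration and are not taken from the paper.

```python
import math


def log_likelihood_ratio(obs, P, pi, mu):
    """Log-likelihood ratio of H1 (random walk present) against H0 (noise only).

    obs[t][j] is the measurement at node j and time t; P is the transition
    matrix of the walk; pi is its initial distribution. Noise is assumed to
    be N(0, 1) at every node (an assumption of this sketch).
    """
    n = len(pi)
    belief = list(pi)  # predicted distribution of the walker's position
    log_lr = 0.0
    for t, y in enumerate(obs):
        if t > 0:
            # Propagate the belief one step through the Markov chain.
            belief = [sum(belief[i] * P[i][j] for i in range(n)) for j in range(n)]
        # Per-node ratio N(y; mu, 1) / N(y; 0, 1) = exp(mu * y - mu^2 / 2).
        weighted = [belief[j] * math.exp(mu * y[j] - 0.5 * mu * mu) for j in range(n)]
        total = sum(weighted)
        log_lr += math.log(total)
        # Normalize to avoid overflow or underflow on long sequences.
        belief = [w / total for w in weighted]
    return log_lr


# Two-node example with a "sticky" walk; the numbers are arbitrary.
P = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.5, 0.5]
obs = [[0.3, -1.2], [2.0, 0.1], [-0.5, 0.7]]
statistic = log_likelihood_ratio(obs, P, pi, mu=1.5)
```

A detector would compare `statistic` with a threshold chosen to meet the false alarm constraint. The recursion costs $O(n^2)$ per time step for a dense transition matrix.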

The expression for the error exponent resembles the free energy of a spin glass in statistical physics, and we borrow techniques from that field to develop a lower bound. Our fully rigorous analysis uses large deviations theory to show that the lower bound exhibits a phase transition: strong performance is only guaranteed when the signal-to-noise ratio exceeds twice the entropy rate of the random walk.

Monte Carlo simulations show that the lower bound fully captures the behavior of the true exponent.
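The threshold in the phase transition above is stated in terms of the entropy rate of the random walk. As a rough illustration, the entropy rate of an ergodic Markov chain with transition matrix $P$ and stationary distribution $\pi$ is $-\sum_i \pi_i \sum_j P_{ij} \log P_{ij}$. The sketch below computes it in nats; the helper names are hypothetical, and the exact definition and units of the signal-to-noise ratio used in the paper are not given in the abstract, so the final comparison is only indicative.

```python
import math


def stationary_distribution(P, iters=10_000, tol=1e-14):
    """Approximate the stationary distribution by power iteration.

    Assumes the chain is irreducible and aperiodic so the iteration converges.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def entropy_rate(P):
    """Entropy rate in nats: -sum_i pi_i sum_j P_ij log P_ij (0 log 0 = 0)."""
    pi = stationary_distribution(P)
    return -sum(
        pi[i] * p * math.log(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0.0
    )


# Lazy random walk on a 4-cycle: stay with probability 1/2, else step to a neighbor.
P_cycle = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]
rate = entropy_rate(P_cycle)
snr_threshold = 2.0 * rate  # "twice the entropy rate", under the caveats above
```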

We study a spectral initialization method that serves as a key ingredient in recent work on using efficient iterative algorithms for estimating signals in nonconvex settings. Unlike previous analysis in the literature, which is restricted to the phase retrieval setting and which provides only performance bounds, we consider arbitrary generalized linear sensing models and present a precise asymptotic characterization of the performance of the spectral method in the high-dimensional regime. Our analysis reveals a phase transition phenomenon that depends on the sampling ratio. When the ratio is below a minimum threshold, the estimates given by the spectral method are no better than a random guess drawn uniformly from the hypersphere; above a maximum threshold, however, the estimates become increasingly aligned with the target signal. The computational complexity of the spectral method is also markedly different in the two phases. Worked examples and numerical results are provided to illustrate and verify the analytical predictions. In particular, simulations show that our asymptotic formulas provide accurate predictions even at moderate signal dimensions.
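In its usual form, a spectral method of this type returns the leading eigenvector of the data matrix $D = \frac{1}{m} \sum_{i=1}^{m} T(y_i)\, a_i a_i^{\top}$, where $a_i$ are the sensing vectors, $y_i$ the measurements, and $T$ a preprocessing function. The sketch below finds that eigenvector by matrix-free power iteration. It assumes $T(y_i) \ge 0$ so that $D$ is positive semidefinite and power iteration targets the largest eigenvalue; the function name `spectral_estimate`, the identity default for `preprocess`, and the toy data are all illustrative assumptions.

```python
import math


def spectral_estimate(A, y, preprocess=lambda v: v, iters=500):
    """Unit-norm leading eigenvector of D = (1/m) sum_i T(y_i) a_i a_i^T.

    A is a list of m sensing vectors of length n. D is never formed:
    each iteration computes D x as (1/m) sum_i T(y_i) (a_i . x) a_i.
    """
    m, n = len(A), len(A[0])
    weights = [preprocess(yi) for yi in y]
    x = [1.0 / math.sqrt(n)] * n  # deterministic starting vector
    for _ in range(iters):
        proj = [sum(a * xj for a, xj in zip(row, x)) for row in A]
        nxt = [sum(weights[i] * proj[i] * A[i][j] for i in range(m)) / m for j in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break  # D x vanished; keep the current iterate
        x = [v / norm for v in nxt]
    return x


# Toy data for which D is diagonal with its largest entry on the first coordinate.
A_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
y_toy = [2.0, 1.0, 2.0]
x_hat = spectral_estimate(A_toy, y_toy)
```

The number of power iterations needed depends on the gap between the two largest eigenvalues of $D$, which is one way the computational cost can differ between the two phases.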

We present the optimal design of a spectral method widely used to initialize nonconvex optimization algorithms for solving phase retrieval and other signal recovery problems. Our work leverages recent results that provide an exact characterization of the performance of the spectral method in the high-dimensional limit. This characterization allows us to map the task of optimal design to a constrained optimization problem in a weighted $L^2$ function space. The latter has a closed-form solution. Interestingly, under a mild technical condition, our results show that there exists a fixed design that is uniformly optimal over all sampling ratios. Numerical simulations demonstrate the performance improvement brought by the proposed optimal design over existing constructions in the literature. In a recent work, Mondelli and Montanari have shown the existence of a weak reconstruction threshold below which the spectral method cannot provide useful estimates. Our results serve to complement that work by deriving the fundamental limit of the spectral method beyond the aforementioned threshold.

Substantial progress has been made recently on developing provably accurate and efficient algorithms for low-rank matrix factorization via nonconvex optimization. While conventional wisdom often takes a dim view of nonconvex optimization algorithms due to their susceptibility to spurious local minima, simple iterative methods such as gradient descent have been remarkably successful in practice. The theoretical footings, however, had been largely lacking until recently.

In this tutorial-style overview, we highlight the important role of statistical models in enabling efficient nonconvex optimization with performance guarantees. We review two contrasting approaches: (1) two-stage algorithms, which consist of a tailored initialization step followed by successive refinement; and (2) global landscape analysis and initialization-free algorithms. Several canonical matrix factorization problems are discussed, including but not limited to matrix sensing, phase retrieval, matrix completion, blind deconvolution, robust principal component analysis, phase synchronization, and joint alignment. Special care is taken to illustrate the key technical insights underlying their analyses. This article serves as a testament that the integrated thinking of optimization and statistics leads to fruitful research findings.
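To make the two-stage idea concrete, the sketch below shows only the refinement stage for a real-valued phase retrieval toy problem: plain gradient descent on $f(x) = \frac{1}{4m} \sum_i \big((a_i^{\top} x)^2 - y_i\big)^2$, started from a point assumed to be already close to the truth. The starting point stands in for the output of an initialization step; the step size, iteration count, data, and function names are illustrative assumptions rather than choices taken from the article.

```python
def loss(A, y, x):
    """Quartic least-squares loss f(x) = (1/4m) sum_i ((a_i . x)^2 - y_i)^2."""
    m = len(A)
    total = 0.0
    for row, yi in zip(A, y):
        r = sum(a * xj for a, xj in zip(row, x)) ** 2 - yi
        total += r * r
    return total / (4.0 * m)


def gradient(A, y, x):
    """Gradient of the loss: (1/m) sum_i ((a_i . x)^2 - y_i) (a_i . x) a_i."""
    m, n = len(A), len(x)
    g = [0.0] * n
    for row, yi in zip(A, y):
        p = sum(a * xj for a, xj in zip(row, x))
        c = (p * p - yi) * p / m
        for j in range(n):
            g[j] += c * row[j]
    return g


def refine(A, y, x0, step=0.05, iters=500):
    """Stage two: successive refinement by fixed-step gradient descent."""
    x = list(x0)
    for _ in range(iters):
        g = gradient(A, y, x)
        x = [xj - step * gj for xj, gj in zip(x, g)]
    return x


# Noiseless toy problem with a known answer.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
x_true = [1.0, 2.0]
y = [sum(a * xj for a, xj in zip(row, x_true)) ** 2 for row in A]
x_init = [0.9, 2.1]  # stands in for the output of an initialization stage
x_hat = refine(A, y, x_init)
```

Because the measurements depend on $x$ only through $(a_i^{\top} x)^2$, the vector $-x$ fits the data equally well, so recovery is possible only up to a global sign.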

In sparse linear regression, the SLOPE estimator generalizes LASSO by assigning magnitude-dependent regularizations to different coordinates of the estimate. In this paper, we present an asymptotically exact characterization of the performance of SLOPE in the high-dimensional regime where the number of unknown parameters grows in proportion to the number of observations. Our asymptotic characterization enables us to derive optimal regularization sequences to either minimize the MSE or to maximize the power in variable selection under any given level of Type-I error. In both cases, we show that the optimal design can be recast as certain infinite-dimensional convex optimization problems, which have efficient and accurate finite-dimensional approximations. Numerical simulations verify our asymptotic predictions. They also demonstrate the superiority of our optimal design over LASSO and a regularization sequence previously proposed in the literature.
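The magnitude-dependent regularization used by SLOPE is the sorted $\ell_1$ penalty $J_{\lambda}(\beta) = \sum_{i} \lambda_i |\beta|_{(i)}$, where $\lambda_1 \ge \lambda_2 \ge \dots \ge 0$ and $|\beta|_{(1)} \ge |\beta|_{(2)} \ge \dots$ are the magnitudes of the coordinates in decreasing order. The short sketch below evaluates this penalty; the function name `slope_penalty` and the example values are illustrative assumptions.

```python
def slope_penalty(beta, lambdas):
    """Sorted-L1 penalty: pair the largest |beta_i| with the largest lambda.

    lambdas must be nonnegative and nonincreasing, one entry per coordinate.
    """
    if len(beta) != len(lambdas):
        raise ValueError("beta and lambdas must have the same length")
    if any(a < b for a, b in zip(lambdas, lambdas[1:])):
        raise ValueError("lambdas must be nonincreasing")
    if lambdas and lambdas[-1] < 0:
        raise ValueError("lambdas must be nonnegative")
    magnitudes = sorted((abs(b) for b in beta), reverse=True)
    return sum(lam * mag for lam, mag in zip(lambdas, magnitudes))


beta = [1.0, -3.0, 2.0]
slope_value = slope_penalty(beta, [3.0, 2.0, 1.0])  # larger coordinates pay more
lasso_value = slope_penalty(beta, [0.5, 0.5, 0.5])  # constant sequence: 0.5 * ||beta||_1
```

With a constant regularization sequence the penalty reduces to the LASSO penalty, which is the sense in which SLOPE generalizes LASSO.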