# Publications by Type: Journal Article

^{2}) resource in space and time. The HD algorithm overcomes this O(n

^{2}) bottleneck by using the principle of deferred decisions: rather than fixing the entire random matrix in advance, it lets the randomness unfold with the dynamics. At the heart of this matrix-free algorithm is an adaptive and recursive construction of (random) Householder reflectors. These orthogonal transformations exploit the group symmetry of the matrix ensembles, while simultaneously maintaining the statistical correlations induced by the dynamics. The memory and computation costs of the HD algorithm are O(nT) and O(nT

^{2}), respectively, with T being the number of iterations. When T≪n, which is nearly always the case in practice, the new algorithm leads to significant reductions in runtime and memory footprint. Numerical results demonstrate the promise of the HD algorithm as a new computational tool in the study of high-dimensional random systems.

We study the problem of detecting a random walk on a graph from a sequence of noisy measurements at every node. There are two hypotheses: either every observation is just meaningless zero-mean Gaussian noise, or at each time step exactly one node has an elevated mean, with its location following a random walk on the graph over time. We want to exploit knowledge of the graph structure and random walk parameters (specified by a Markov chain transition matrix) to detect a possibly very weak signal. The optimal detector is easily derived, and we focus on the harder problem of characterizing its performance through the (type-II) error exponent: the decay rate of the miss probability under a false alarm constraint.

The expression for the error exponent resembles the free energy of a spin glass in statistical physics, and we borrow techniques from that field to develop a lower bound. Our fully rigorous analysis uses large deviations theory to show that the lower bound exhibits a phase transition: strong performance is only guaranteed when the signal-to-noise ratio exceeds twice the entropy rate of the random walk.

Monte Carlo simulations show that the lower bound fully captures the behavior of the true exponent.

We study a spectral initialization method that serves as a key ingredient in recent work on using efficient iterative algorithms for estimating signals in nonconvex settings. Unlike previous analysis in the literature, which is restricted to the phase retrieval setting and which provides only performance bounds, we consider arbitrary generalized linear sensing models and present a precise asymptotic characterization of the performance of the spectral method in the high-dimensional regime. Our analysis reveals a phase transition phenomenon that depends on the sampling ratio. When the ratio is below a minimum threshold, the estimates given by the spectral method are no better than a random guess drawn uniformly from the hypersphere; above a maximum threshold, however, the estimates become increasingly aligned with the target signal. The computational complexity of the spectral method is also markedly different in the two phases. Worked examples and numerical results are provided to illustrate and verify the analytical predictions. In particular, simulations show that our asymptotic formulas provide accurate predictions even at moderate signal dimensions.

We present the optimal design of a spectral method widely used to initialize nonconvex optimization algorithms for solving phase retrieval and other signal recovery problems. Our work leverages recent results that provide an exact characterization of the performance of the spectral method in the high-dimensional limit. This characterization allows us to map the task of optimal design to a constrained optimization problem in a weighted $L^2$ function space. The latter has a closed-form solution. Interestingly, under a mild technical condition, our results show that there exists a fixed design that is uniformly optimal over all sampling ratios. Numerical simulations demonstrate the performance improvement brought by the proposed optimal design over existing constructions in the literature. In a recent work, Mondelli and Montanari have shown the existence of a weak reconstruction threshold below which the spectral method cannot provide useful estimates. Our results serve to complement that work by deriving the fundamental limit of the spectral method beyond the aforementioned threshold.

Substantial progress has been made recently on developing provably accurate and efficient algorithms for low-rank matrix factorization via nonconvex optimization. While conventional wisdom often takes a dim view of nonconvex optimization algorithms due to their susceptibility to spurious local minima, simple iterative methods such as gradient descent have been remarkably successful in practice. The theoretical footings, however, had been largely lacking until recently.

In this tutorial-style overview, we highlight the important role of statistical models in enabling efficient nonconvex optimization with performance guarantees. We review two contrasting approaches: (1) two-stage algorithms, which consist of a tailored initialization step followed by successive refinement; and (2) global landscape analysis and initialization-free algorithms. Several canonical matrix factorization problems are discussed, including but not limited to matrix sensing, phase retrieval, matrix completion, blind deconvolution, robust principal component analysis, phase synchronization, and joint alignment. Special care is taken to illustrate the key technical insights underlying their analyses. This article serves as a testament that the integrated thinking of optimization and statistics leads to fruitful research findings.

Sparse coding refers to the pursuit of the sparsest representation of a signal in a typically overcomplete dictionary. From a Bayesian perspective, sparse coding provides a Maximum a Posteriori (MAP) estimate of the unknown vector under a sparse prior. Various nonlinear algorithms are available to approximate the solution of such problems.

In this work, we suggest enhancing the performance of sparse coding algorithms by a deliberate and controlled contamination of the input with random noise, a phenomenon known as stochastic resonance. This not only allows for increased performance, but also provides a computationally efficient approximation to the Minimum Mean Square Error (MMSE) estimator, which is ordinarily intractable to compute. We demonstrate our findings empirically and provide a theoretical analysis of our method under several different cases.

Finding the sparse representation of a signal in an overcomplete dictionary has attracted a lot of attention over the past years. This paper studies ProSparse, a new polynomial complexity algorithm that solves the sparse representation problem when the underlying dictionary is the union of a Vandermonde matrix and a banded matrix. Unlike our previous work which establishes deterministic (worst-case) sparsity bounds for ProSparse to succeed, this paper presents a probabilistic average-case analysis of the algorithm. Based on a generating-function approach, closed-form expressions for the exact success probabilities of ProSparse are given. The success probabilities are also analyzed in the high-dimensional regime. This asymptotic analysis characterizes a sharp phase transition phenomenon regarding the performance of the algorithm.

We propose an image representation scheme combining the local and nonlocal characterization of patches in an image. Our representation scheme can be shown to be equivalent to a tight frame constructed from convolving local bases (e.g., wavelet frames, discrete cosine transforms, etc.) with nonlocal bases (e.g., spectral basis induced by nonlinear dimension reduction on patches), and we call the resulting frame elements convolution framelets. Insight gained from analyzing the proposed representation leads to a novel interpretation of a recent high-performance patch-based image pro- cessing algorithm using the point integral method (PIM) and the low dimensional manifold model (LDMM) [S. Osher, Z. Shi, and W. Zhu, Low Dimensional Manifold Model for Image Processing, Tech. Rep., CAM report 16-04, UCLA, Los Angeles, CA, 2016]. In particular, we show that LDMM is a weighted l2-regularization on the coefficients obtained by decomposing images into linear combinations of convolution framelets; based on this understanding, we extend the original LDMM to a reweighted version that yields further improved results. In addition, we establish the energy concentration property of convolution framelet coefficients for the setting where the local basis is constructed from a given nonlocal basis via a linear reconstruction framework; a generalization of this framework to unions of local embeddings can provide a natural setting for interpreting BM3D, one of the state-of-the-art image denoising algorithms.

Many patch-based image denoising algorithms can be formulated as applying a smoothing filter to the noisy image. Expressed as matrices, the smoothing filters must be row normalized so that each row sums to unity. Surprisingly, if we apply a column normalization before the row normalization, the performance of the smoothing filter can often be significantly improved. Prior works showed that such performance gain is related to the Sinkhorn-Knopp balancing algorithm, an iterative procedure that symmetrizes a row-stochastic matrix to a doubly-stochastic matrix. However, a complete understanding of the performance gain phenomenon is still lacking.

In this paper, we study the performance gain phenomenon from a statistical learning perspective. We show that Sinkhorn-Knopp is equivalent to an Expectation-Maximization (EM) algorithm of learning a Product of Gaussians (PoG) prior of the image patches. By establishing the correspondence between the steps of Sinkhorn-Knopp and the EM algorithm, we provide a geometrical interpretation of the symmetrization process. The new PoG model also allows us to develop a new denoising algorithm called Product of Gaussian Non-Local-Means (PoG-NLM). PoG-NLM is an extension of the Sinkhorn-Knopp and is a generalization of the classical non-local means. Despite its simple formulation, PoG-NLM outperforms many existing smoothing filters and has a similar performance compared to BM3D.