Publications

2017
R. Yin, T. Gao, Y. M. Lu, and I. Daubechies, “A Tale of Two Bases: Local-Nonlocal Regularization on Image Patches with Convolution Framelets,” SIAM Journal on Imaging Sciences, vol. 10, no. 2, pp. 711-750, 2017.

We propose an image representation scheme combining the local and nonlocal characterization of patches in an image. Our representation scheme can be shown to be equivalent to a tight frame constructed from convolving local bases (e.g., wavelet frames, discrete cosine transforms, etc.) with nonlocal bases (e.g., spectral basis induced by nonlinear dimension reduction on patches), and we call the resulting frame elements convolution framelets. Insight gained from analyzing the proposed representation leads to a novel interpretation of a recent high-performance patch-based image processing algorithm using the point integral method (PIM) and the low dimensional manifold model (LDMM) [S. Osher, Z. Shi, and W. Zhu, Low Dimensional Manifold Model for Image Processing, Tech. Rep., CAM report 16-04, UCLA, Los Angeles, CA, 2016]. In particular, we show that LDMM is a weighted l2-regularization on the coefficients obtained by decomposing images into linear combinations of convolution framelets; based on this understanding, we extend the original LDMM to a reweighted version that yields further improved results. In addition, we establish the energy concentration property of convolution framelet coefficients for the setting where the local basis is constructed from a given nonlocal basis via a linear reconstruction framework; a generalization of this framework to unions of local embeddings can provide a natural setting for interpreting BM3D, one of the state-of-the-art image denoising algorithms.

m109144.pdf
S. H. Chan, T. Zickler, and Y. M. Lu, “Understanding Symmetric Smoothing Filters: A Gaussian Mixture Model Perspective,” IEEE Transactions on Image Processing, vol. 26, no. 11, pp. 5107-5121, 2017. arXiv:1601.00088

Many patch-based image denoising algorithms can be formulated as applying a smoothing filter to the noisy image. Expressed as matrices, the smoothing filters must be row normalized so that each row sums to unity. Surprisingly, if we apply a column normalization before the row normalization, the performance of the smoothing filter can often be significantly improved. Prior works showed that such performance gain is related to the Sinkhorn-Knopp balancing algorithm, an iterative procedure that symmetrizes a row-stochastic matrix to a doubly-stochastic matrix. However, a complete understanding of the performance gain phenomenon is still lacking.

In this paper, we study the performance gain phenomenon from a statistical learning perspective. We show that Sinkhorn-Knopp is equivalent to an Expectation-Maximization (EM) algorithm of learning a Product of Gaussians (PoG) prior of the image patches. By establishing the correspondence between the steps of Sinkhorn-Knopp and the EM algorithm, we provide a geometrical interpretation of the symmetrization process. The new PoG model also allows us to develop a new denoising algorithm called Product of Gaussian Non-Local-Means (PoG-NLM). PoG-NLM is an extension of the Sinkhorn-Knopp and is a generalization of the classical non-local means. Despite its simple formulation, PoG-NLM outperforms many existing smoothing filters and has a similar performance compared to BM3D.
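For readers unfamiliar with the balancing procedure discussed above, the following is a minimal Python sketch of the Sinkhorn-Knopp iteration; the matrix size and iteration count are illustrative choices, not values from the paper.

```python
import numpy as np

def sinkhorn_knopp(W, num_iters=100):
    """Alternately normalize columns and rows of a nonnegative matrix.

    For a strictly positive matrix, the iterates converge to a doubly
    stochastic matrix (all row sums and column sums equal to one).
    """
    W = np.asarray(W, dtype=float)
    for _ in range(num_iters):
        W = W / W.sum(axis=0, keepdims=True)  # column normalization
        W = W / W.sum(axis=1, keepdims=True)  # row normalization
    return W

# Example: balance a random strictly positive matrix.
rng = np.random.default_rng(0)
W = sinkhorn_knopp(rng.random((6, 6)) + 0.1)
```

Applying a column normalization before the row normalization, as discussed in the abstract, corresponds to stopping this iteration after one sweep.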

2016
C. Wang and Y. M. Lu, “Online Learning for Sparse PCA in High Dimensions: Exact Dynamics and Phase Transitions,” in IEEE Information Theory Workshop (ITW), 2016.

We study the dynamics of an online algorithm for learning a sparse leading eigenvector from samples generated from a spiked covariance model. This algorithm combines the classical Oja's method for online PCA with an element-wise nonlinearity at each iteration to promote sparsity. In the high-dimensional limit, the joint empirical measure of the underlying sparse eigenvector and its estimate provided by the algorithm is shown to converge weakly to a deterministic, measure-valued process. This scaling limit is characterized as the unique solution of a nonlinear PDE, and it provides exact information regarding the asymptotic performance of the algorithm. For example, performance metrics such as the cosine similarity and the misclassification rate in sparse support recovery can be obtained by examining the limiting dynamics. A steady-state analysis of the nonlinear PDE also reveals an interesting phase transition phenomenon. Although our analysis is asymptotic in nature, numerical simulations show that the theoretical predictions are accurate for moderate signal dimensions.

spca.pdf
J. Onativia, P. L. Dragotti, and Y. M. Lu, “ProSparse denoise: Prony's based sparsity pattern recovery in the presence of noise,” in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016.
Y. Chi and Y. M. Lu, “Kaczmarz Method for Solving Quadratic Equations,” IEEE Signal Processing Letters, vol. 23, no. 9, pp. 1183-1187, 2016.

Estimating low-rank positive-semidefinite (PSD) matrices from symmetric rank-one measurements is of great importance in many applications, such as high-dimensional data processing, quantum state tomography, and phase retrieval. When the rank is known a priori, this problem can be regarded as solving a system of quadratic equations of a low-dimensional subspace. The authors develop a fast iterative algorithm based on an adaptation of the Kaczmarz method, which is traditionally used for solving overdetermined linear systems. In particular, the authors characterize the dynamics of the algorithm when the measurement vectors are composed of standard Gaussian entries in the online setting. Numerical simulations demonstrate the compelling performance of the proposed algorithm.

kaczmarz_quadratic_final.pdf
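A minimal real-valued sketch of a Kaczmarz-type step for measurements y_i = (a_i^T x)^2: each iteration projects the iterate onto the nearer of the two hyperplanes a_i^T x = ±√(y_i). This is an illustrative reading of the approach, not the paper's exact algorithm (which also treats the online Gaussian setting); the warm start below is only to keep the toy example in the local convergence regime.

```python
import numpy as np

def kaczmarz_quadratic(A, y, x0, num_iters, rng):
    """Kaczmarz-type updates for the quadratic system (a_i^T x)^2 = y_i.

    Each step picks a random row and projects x onto the nearer of the
    two hyperplanes a_i^T x = +sqrt(y_i) and a_i^T x = -sqrt(y_i).
    """
    x = x0.copy()
    m = len(y)
    for _ in range(num_iters):
        i = rng.integers(m)
        a = A[i]
        r = a @ x
        target = np.sign(r) * np.sqrt(y[i])   # choose the nearer hyperplane
        x = x + (target - r) / (a @ a) * a
    return x
```

Since the measurements are sign-blind, the iterate can only be recovered up to a global sign, so error should be measured as min(‖x − x*‖, ‖x + x*‖).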
F. Sroubek, J. Kamenicky, and Y. M. Lu, “Decomposition of space-variant blur in image deconvolution,” IEEE Signal Processing Letters, vol. 23, no. 3, pp. 346-350, 2016.

Standard convolution as a model of radiometric degradation is in the majority of cases inaccurate, as the blur varies in space and we are thus required to work with a computationally demanding space-variant model. Space-variant degradation can be approximately decomposed into a set of standard convolutions. We explain in detail the properties of the space-variant degradation operator and show two possible decomposition models and two approximation approaches. Our target application is space-variant image deconvolution, on which we illustrate theoretical differences between these models. We propose a computationally efficient restoration algorithm that belongs to the category of alternating direction methods of multipliers, which consists of four update steps with closed-form solutions. Depending on the decomposition used, two variations of the algorithm exist with distinct properties. We test the effectiveness of the decomposition models under different levels of approximation on synthetic and real examples, and conclude the letter by drawing several practical observations.

main_rev.pdf
D. M. Merfeld, T. K. Clark, Y. M. Lu, and F. Karmali, “Dynamics of Individual Perceptual Decisions,” Journal of Neurophysiology, vol. 115, no. 1, 2016.

Perceptual decision-making is fundamental to a broad range of fields including neurophysiology, economics, medicine, advertising, law, etc. While recent findings have yielded major advances in our understanding of perceptual decision-making, decision-making as a function of time and frequency (i.e., decision-making dynamics) is not well understood. To limit the review length, we focus most of this review on human findings. Animal findings, which are extensively reviewed elsewhere, are included when beneficial or necessary. We attempt to put these various findings and datasets - which can appear to be unrelated in the absence of a formal dynamic analysis - into context using published models. Specifically, by adding appropriate dynamic mechanisms (e.g., high-pass filters) to existing models, it appears that a number of otherwise seemingly disparate findings from the literature might be explained. One hypothesis that arises through this dynamic analysis is that decision-making includes phasic (high-pass) neural mechanisms, an evidence accumulator and/or some sort of mid-trial decision-making mechanism (e.g., peak detector and/or decision boundary).

C. Hu, J. Sepulcre, K. A. Johnson, G. E. Fakhri, Y. M. Lu, and Q. Li, “Matched Signal Detection on Graphs: Theory and Application to Brain Imaging Data Classification,” NeuroImage, vol. 125, pp. 587-600, 2016.

Motivated by recent progress in signal processing on graphs, we have developed a matched signal detection (MSD) theory for signals with intrinsic structures described by weighted graphs. First, we regard graph Laplacian eigenvalues as frequencies of graph-signals and assume that the signal is in a subspace spanned by the first few graph Laplacian eigenvectors associated with lower eigenvalues. The conventional matched subspace detector can be applied to this case. Furthermore, we study signals that may not merely live in a subspace. Namely, we consider signals with bounded variation on graphs and more general signals that are randomly drawn from a prior distribution. For bounded variation signals, the test is a weighted energy detector. For the random signals, the test statistic is the difference of signal variations on associated graphs, if a degenerate Gaussian distribution specified by the graph Laplacian is adopted. We evaluate the effectiveness of the MSD on graphs on both simulated and real data sets. Specifically, we apply MSD to the brain imaging data classification problem of Alzheimer’s disease (AD) based on two independent data sets: 1) positron emission tomography data with Pittsburgh compound-B tracer of 30 AD and 40 normal control (NC) subjects, 2) resting-state functional magnetic resonance imaging (R-fMRI) data of 30 early mild cognitive impairment and 20 NC subjects. Our results demonstrate that the MSD approach is able to outperform the traditional methods and help detect AD at an early stage, probably due to the success of exploiting the manifold structure of the data.
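A toy sketch of the first detector described above: project the observed signal onto the first few Laplacian eigenvectors and compare the captured energy to that of noise. The path graph and the threshold-free energy ratio are illustrative choices, not taken from the paper.

```python
import numpy as np

def path_laplacian(n):
    # Combinatorial Laplacian L = D - A of a path graph on n nodes.
    A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    return np.diag(A.sum(axis=1)) - A

def low_frequency_energy_ratio(signal, L, k):
    """Fraction of signal energy in the span of the first k Laplacian
    eigenvectors (the low graph frequencies)."""
    _, eigvecs = np.linalg.eigh(L)       # eigh sorts eigenvalues ascending
    proj = eigvecs[:, :k].T @ signal
    return (proj @ proj) / (signal @ signal)
```

A smooth graph signal concentrates its energy in the low frequencies (ratio near 1), while white noise spreads energy uniformly (ratio near k/n), which is what a matched subspace detector exploits.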

A. Minot, Y. M. Lu, and N. Li, “A Distributed Gauss-Newton Method for Power System State Estimation,” IEEE Transactions on Power Systems, vol. 31, no. 5, pp. 3804-3815, 2016.

We propose a fully distributed Gauss-Newton algorithm for state estimation of electric power systems. At each Gauss-Newton iteration, matrix-splitting techniques are utilized to carry out the matrix inversion needed for calculating the Gauss-Newton step in a distributed fashion. In order to reduce the communication burden as well as increase robustness of state estimation, the proposed distributed scheme relies only on local information and a limited amount of information from neighboring areas. The matrix-splitting scheme is designed to calculate the Gauss-Newton step with exponential convergence speed. The effectiveness of the method is demonstrated in various numerical experiments.

I. Dokmanic and Y. M. Lu, “Sampling Sparse Signals on the Sphere: Algorithms and Applications,” IEEE Transactions on Signal Processing, vol. 64, no. 1, pp. 189-202, 2016. arXiv:1502.07577

We propose a sampling scheme that can perfectly reconstruct a collection of spikes on the sphere from samples of their lowpass-filtered observations. Central to our algorithm is a generalization of the annihilating filter method, a tool widely used in array signal processing and finite-rate-of-innovation (FRI) sampling. The proposed algorithm can reconstruct $K$ spikes from $(K+\sqrt{K})^2$ spatial samples---a sampling requirement that improves over known sparse sampling schemes on the sphere by a factor of up to four.

We showcase the versatility of the proposed algorithm by applying it to three different problems: 1) sampling diffusion processes induced by localized sources on the sphere, 2) shot-noise removal, and 3) sound source localization (SSL) by a spherical microphone array. In particular, we show how SSL can be reformulated as a spherical sparse sampling problem.

2015
G. Li, Y. Gu, and Y. M. Lu, “Phase Retrieval Using Iterative Projections: Dynamics in the Large Systems Limit,” in Allerton Conference on Communications, Control, and Computing, 2015.

We study phase retrieval by using a simple non-convex algorithm based on iterative projections. Our main contribution in this work is to establish an exact analysis of the dynamics of the algorithm in an online setting, with Gaussian measurement vectors. We show that the algorithm dynamics, measured by the squared distance between the current estimate and the true solution, can be fully characterized by a 2D Markov process, irrespective of the underlying signal dimension. Furthermore, in the large systems limit (i.e., as the signal dimension tends to infinity), the random sample paths of the 2D Markov process converge to the solutions of two deterministic and coupled ordinary differential equations (ODEs). Numerical simulations verify the accuracy of our analytical predictions, even for moderate system sizes. This suggests that the ODE approach presented in this work provides an effective tool for analyzing the performance and convergence of the algorithm.

phase_dynamics.pdf
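A batch, real-valued sketch of the projection idea: alternately impose the measured magnitudes and project back onto the range of the measurement matrix. The paper analyzes an online variant with Gaussian measurement vectors; the warm start below is only to keep this toy example in the local convergence regime.

```python
import numpy as np

def iterative_projections(A, y, x0, num_iters=100):
    """Alternating projections for y = |A x| (real case).

    Each iteration keeps the signs of the current residual A x, enforces
    the measured magnitudes, and maps back via the pseudoinverse of A.
    """
    A_pinv = np.linalg.pinv(A)        # computed once, reused every iteration
    x = x0.copy()
    for _ in range(num_iters):
        z = A @ x
        x = A_pinv @ (y * np.sign(z))  # magnitude projection, then range projection
    return x
```

Once the sign pattern of A x matches that of the true signal, the next iterate is exact, which is why the method converges rapidly near the solution (up to the unavoidable global sign ambiguity).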
S. H. Chan, T. Zickler, and Y. M. Lu, “Understanding symmetric smoothing filters via Gaussian mixtures,” in IEEE International Conference on Image Processing, 2015.
J. Onativia, P. L. Dragotti, and Y. M. Lu, “Sparsity according to Prony, Average Performance Analysis,” in Signal Processing with Adaptive Sparse Structured Representations (SPARS) Workshop, Cambridge, England, 2015. 2015_spars.pdf
C. Hu, et al., “A Spectral Graph Regression Model for Learning Brain Connectivity of Alzheimer's Disease,” PLOS ONE, vol. 10, no. 5, 2015.

Understanding network features of brain pathology is essential to reveal underpinnings of neurodegenerative diseases. In this paper, we introduce a novel graph regression model (GRM) for learning structural brain connectivity of Alzheimer’s disease (AD) measured by amyloid-β deposits. The proposed GRM regards 11C-labeled Pittsburgh Compound-B (PiB) positron emission tomography (PET) imaging data as smooth signals defined on an unknown graph. This graph is then estimated through an optimization framework, which fits the graph to the data with an adjustable level of uniformity of the connection weights. Under the assumed data model, results based on simulated data illustrate that our approach can accurately reconstruct the underlying network, often with better reconstruction than those obtained by both sample correlation and ℓ1-regularized partial correlation estimation. Evaluations performed upon PiB-PET imaging data of 30 AD and 40 elderly normal control (NC) subjects demonstrate that the connectivity patterns revealed by the GRM are easy to interpret and consistent with known pathology. Moreover, the hubs of the reconstructed networks match the cortical hubs given by functional MRI. The discriminative network features including both global connectivity measurements and degree statistics of specific nodes discovered from the AD and NC amyloid-β networks provide new potential biomarkers for preclinical and clinical AD.

A. Agaskar and Y. M. Lu, “Optimal Detection of Random Walks on Graphs: Performance Analysis via Statistical Physics,” Technical Report, 2015. arXiv:1504.06924

We study the problem of detecting a random walk on a graph from a sequence of noisy measurements at every node. There are two hypotheses: either every observation is just meaningless zero-mean Gaussian noise, or at each time step exactly one node has an elevated mean, with its location following a random walk on the graph over time. We want to exploit knowledge of the graph structure and random walk parameters (specified by a Markov chain transition matrix) to detect a possibly very weak signal. The optimal detector is easily derived, and we focus on the harder problem of characterizing its performance through the (type-II) error exponent: the decay rate of the miss probability under a false alarm constraint.
The expression for the error exponent resembles the free energy of a spin glass in statistical physics, and we borrow techniques from that field to develop a lower bound. Our fully rigorous analysis uses large deviations theory to show that the lower bound exhibits a phase transition: strong performance is only guaranteed when the signal-to-noise ratio exceeds twice the entropy rate of the random walk.
Monte Carlo simulations show that the lower bound fully captures the behavior of the true exponent.

2014
A. Agaskar and Y. M. Lu, “Optimal hypothesis testing with combinatorial structure: Detecting random walks on graphs,” in Proc. of Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, 2014.

Suppose we have a time series of observations for each node in a network, and we wish to detect the presence of a particle undergoing a random walk on the network. If there is no such particle, then we observe only zero-mean Gaussian noise. If present, however, the location of the particle has an elevated mean. How well can we detect the particle at low signal-to-noise ratios? This is a special case of the problem of detecting hidden Markov processes (HMPs).

The performance metric we analyze is the error exponent of the optimal detector, which measures the exponential rate of decay in the miss probability if the false alarm probability is held fixed as the observation time increases. This problem exhibits deep connections to a problem in statistical physics: computing the free energy density of a spin glass. 

We develop a generalized version of the random energy model (REM) spin glass, whose free energy density provides a lower bound for our error exponent, and compute the bound using large deviations techniques. The bound closely matches empirical results in numerical experiments, and suggests a phase transition phenomenon: below a threshold SNR, the error exponent is nearly constant and near zero, indicating poor performance; above the threshold, there is rapid improvement in performance as the SNR increases. The location of the phase transition depends on the entropy rate of the Markov process.

S. H. Chan and Y. M. Lu, “Efficient image reconstruction for gigapixel quantum image sensors,” in IEEE Global Conference on Signal and Information Processing (GlobalSIP), Atlanta, GA, 2014.

Recent advances in materials, devices and fabrication technologies have motivated a strong momentum in developing solid-state sensors that can detect individual photons in space and time. It has been envisioned that such sensors can eventually achieve very high spatial resolutions (e.g., 10^9 pixels/chip) as well as high frame rates (e.g., 10^6 frames/sec). In this paper, we present an efficient algorithm to reconstruct images from the massive binary bit-streams generated by these sensors. Based on the concept of alternating direction method of multipliers (ADMM), we transform the computationally intensive optimization problem into a sequence of subproblems, each of which has efficient implementations in the form of polyphase-domain filtering or pixel-wise nonlinear mappings. Moreover, we reformulate the original maximum likelihood estimation as maximum a posteriori estimation by introducing a total variation prior. Numerical results demonstrate the strong performance of the proposed method, which achieves several dB of improvement in PSNR and requires a shorter runtime as compared to standard gradient-based approaches.

qis_image_reconstruction.pdf
A. Agaskar, C. Wang, and Y. M. Lu, “Randomized Kaczmarz algorithms: Exact MSE analysis and optimal sampling probabilities,” in IEEE Global Conference on Signal and Information Processing (GlobalSIP), Atlanta, GA, 2014.

The Kaczmarz method, or the algebraic reconstruction technique (ART), is a popular method for solving large-scale overdetermined systems of equations. Recently, Strohmer et al. proposed the randomized Kaczmarz algorithm, an improvement that guarantees exponential convergence to the solution. This has spurred much interest in the algorithm and its extensions. We provide in this paper an exact formula for the mean squared error (MSE) in the value reconstructed by the algorithm. We also compute the exponential decay rate of the MSE, which we call the “annealed” error exponent. We show that the typical performance of the algorithm is far better than the average performance. We define the “quenched” error exponent to characterize the typical performance. This is far harder to compute than the annealed error exponent, but we provide an approximation that matches empirical results. We also explore optimizing the algorithm’s row-selection probabilities to speed up the algorithm’s convergence.

randkac_globalsip14.pdf

(This paper received the Best Student Paper Award of GlobalSIP)
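The randomized Kaczmarz iteration analyzed above is simple to state. The following sketch implements the standard Strohmer-Vershynin variant, in which row i is sampled with probability proportional to ‖a_i‖²; the problem sizes are illustrative only.

```python
import numpy as np

def randomized_kaczmarz(A, b, num_iters, rng):
    """Randomized Kaczmarz for a consistent overdetermined system A x = b.

    Each step samples row i with probability ||a_i||^2 / ||A||_F^2 and
    projects the iterate onto the hyperplane a_i^T x = b_i.
    """
    m, n = A.shape
    row_norms_sq = np.sum(A ** 2, axis=1)
    probs = row_norms_sq / row_norms_sq.sum()
    x = np.zeros(n)
    for _ in range(num_iters):
        i = rng.choice(m, p=probs)
        x = x + (b[i] - A[i] @ x) / row_norms_sq[i] * A[i]
    return x
```

With this row-selection rule, the expected squared error contracts by a fixed factor per step, which is the exponential convergence guarantee mentioned in the abstract; the paper's exact MSE formula and optimized probabilities refine this baseline.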

J. Oñativia, Y. M. Lu, and P. L. Dragotti, “Finite Dimensional FRI,” in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, 2014.

Traditional Finite Rate of Innovation (FRI) theory has considered the problem of sampling continuous-time signals. This framework can be naturally extended to the case where the input is a discrete-time signal. Here we present a novel approach which uses both the traditional FRI sampling scheme, based on the annihilating filter method, and the fact that in this new setup the null space of the problem to be solved is finite dimensional.

In the noiseless scenario, we show that this new approach is able to perfectly recover the original signal at the critical sampling rate. We also present simulation results in the noisy scenario, where this new approach improves performance in terms of the mean squared error (MSE) of the reconstructed signal when compared to the canonical FRI algorithms and compressed sensing (CS).

finite_fri.pdf
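A sketch of the classical annihilating-filter (Prony) step that underlies this line of work, in the simplest 1D noiseless setting: the modes of a sum of exponentials are recovered as the roots of a filter found from the null space of a Toeplitz system. The paper's finite-dimensional refinement and noise handling are not reproduced here.

```python
import numpy as np

def prony_recover(samples, K):
    """Recover the K modes u_k of x[n] = sum_k c_k * u_k**n.

    Builds the Toeplitz system annihilated by the length-(K+1) filter h,
    takes h from the smallest singular vector, and returns its roots.
    """
    N = len(samples)
    rows = N - K
    # Row i encodes sum_j h[j] * x[i + K - j] = 0.
    T = np.array([[samples[i + K - j] for j in range(K + 1)]
                  for i in range(rows)])
    _, _, Vh = np.linalg.svd(T)
    h = Vh[-1]                  # annihilating filter coefficients
    return np.roots(h)          # the roots are the modes u_k
```

Once the modes are known, the amplitudes c_k follow from a linear least-squares fit, which is the standard second step of the method.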
S. H. Chan, T. Zickler, and Y. M. Lu, “Monte Carlo Non-Local Means: Random Sampling for Large-Scale Image Filtering,” IEEE Transactions on Image Processing, vol. 23, no. 8, pp. 3711-3725, 2014.

We propose a randomized version of the non-local means (NLM) algorithm for large-scale image filtering. The new algorithm, called Monte Carlo non-local means (MCNLM), speeds up the classical NLM by computing a small subset of image patch distances, which are randomly selected according to a designed sampling pattern. We make two contributions. First, we analyze the performance of the MCNLM algorithm and show that, for large images or large external image databases, the random outcomes of MCNLM are tightly concentrated around the deterministic full NLM result. In particular, our error probability bounds show that, at any given sampling ratio, the probability for MCNLM to have a large deviation from the original NLM solution decays exponentially as the size of the image or database grows. Second, we derive explicit formulas for optimal sampling patterns that minimize the error probability bound by exploiting partial knowledge of the pairwise similarity weights. Numerical experiments show that MCNLM is competitive with other state-of-the-art fast NLM algorithms for single-image denoising. When applied to denoising images using an external database containing ten billion patches, MCNLM returns a randomized solution that is within 0.2 dB of the full NLM solution while reducing the runtime by three orders of magnitude.

mcnlm.pdf
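A drastically simplified 1D illustration of the random-sampling idea: for each pixel, patch distances are computed only for a random subset of candidate locations rather than all of them. The full MCNLM design, including the optimized sampling patterns, is in the paper; all parameters below are illustrative.

```python
import numpy as np

def mc_nlm_pixel(noisy, patch_radius, h, sampling_ratio, rng):
    """Monte Carlo variant of non-local means for a 1D signal.

    Each output sample is a weighted average of a random subset of the
    input, with weights decaying in the squared patch distance.
    """
    n = len(noisy)
    padded = np.pad(noisy, patch_radius, mode='reflect')
    patches = np.array([padded[i:i + 2 * patch_radius + 1] for i in range(n)])
    m = max(1, int(sampling_ratio * n))      # number of sampled candidates
    out = np.empty(n)
    for i in range(n):
        idx = rng.choice(n, size=m, replace=False)
        d2 = np.sum((patches[idx] - patches[i]) ** 2, axis=1)
        w = np.exp(-d2 / (h * h))            # similarity weights
        out[i] = np.sum(w * noisy[idx]) / np.sum(w)
    return out
```

On a noisy piecewise-constant signal, even this subsampled filter averages mostly over patches from the same flat region, so it reduces the mean squared error relative to the noisy input, mirroring the concentration behavior established in the paper.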
