Publications by Year: 2018

C. Wang, Y. C. Eldar, and Y. M. Lu, “Subspace Estimation from Incomplete Observations: A High-Dimensional Analysis,” IEEE Journal of Selected Topics in Signal Processing, vol. 12, no. 6, 2018. arXiv:1805.06834 [cs.LG]
We present a high-dimensional analysis of three popular algorithms, namely, Oja's method, GROUSE and PETRELS, for subspace estimation from streaming and highly incomplete observations. We show that, with proper time scaling, the time-varying principal angles between the true subspace and its estimates given by the algorithms converge weakly to deterministic processes when the ambient dimension \(n\) tends to infinity. Moreover, the limiting processes can be exactly characterized as the unique solutions of certain ordinary differential equations (ODEs). A finite sample bound is also given, showing that the rate of convergence towards such limits is \(\mathcal{O}(1/\sqrt{n})\). In addition to providing asymptotically exact predictions of the dynamic performance of the algorithms, our high-dimensional analysis yields several insights, including an asymptotic equivalence between Oja's method and GROUSE, and a precise scaling relationship linking the amount of missing data to the signal-to-noise ratio. By analyzing the solutions of the limiting ODEs, we also establish phase transition phenomena associated with the steady-state performance of these techniques.
L. Balzano, Y. Chi, and Y. M. Lu, “A Modern Perspective on Streaming PCA and Subspace Tracking: The Missing Data Case,” Proceedings of the IEEE, vol. 106, no. 8, pp. 1293-1310, 2018.
For many modern applications in science and engineering, data are collected in a streaming fashion carrying time-varying information, and practitioners need to process them with a limited amount of memory and computational resources in a timely manner for decision making. This often is coupled with the missing data problem, such that only a small fraction of data attributes are observed. These complications impose significant, and unconventional, constraints on the problem of streaming Principal Component Analysis (PCA) and subspace tracking, which is an essential building block for many inference tasks in signal processing and machine learning. This survey article reviews a variety of classical and recent algorithms for solving this problem with low computational and memory complexities, particularly those applicable in the big data regime with missing data. We illustrate that streaming PCA and subspace tracking algorithms can be understood through algebraic and geometric perspectives and they need to be adjusted carefully to handle missing data. Both asymptotic and non-asymptotic convergence guarantees are reviewed. Finally, we benchmark the performance of several competitive algorithms in the presence of missing data for both well-conditioned and ill-conditioned systems.
Y. M. Lu, J. Oñativia, and P. L. Dragotti, “Sparse Representation in Fourier and Local Bases Using ProSparse: A Probabilistic Analysis,” IEEE Transactions on Information Theory, vol. 64, no. 4, pp. 2639-2647, 2018. arXiv:1611.07971 [cs.IT]
Finding the sparse representation of a signal in an overcomplete dictionary has attracted a lot of attention over the past years. This paper studies ProSparse, a new polynomial complexity algorithm that solves the sparse representation problem when the underlying dictionary is the union of a Vandermonde matrix and a banded matrix. Unlike our previous work which establishes deterministic (worst-case) sparsity bounds for ProSparse to succeed, this paper presents a probabilistic average-case analysis of the algorithm. Based on a generating-function approach, closed-form expressions for the exact success probabilities of ProSparse are given. The success probabilities are also analyzed in the high-dimensional regime. This asymptotic analysis characterizes a sharp phase transition phenomenon regarding the performance of the algorithm.