Deanna
Needell (compressed sensing, randomized
algorithms, functional analysis, computational
mathematics, probability, statistics)
Dimension Reduction -- prerequisites: Math 60, Math
151 or 052 recommended
Is a fast MRI possible? Data in the modern world, such as the measurements that create an MRI image, is often so voluminous that storing, viewing, and analyzing it is extremely difficult, both physically and computationally. Dimension reduction addresses this issue by projecting high-dimensional data onto a low-dimensional space in such a way that the geometry of the data is preserved. This technology can be used in digital photography, medical imaging, computational biology, and many other applications. In this project, we will harness the power of randomized dimension reduction to accelerate methods in applications that carry high computational costs.
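As a concrete illustration, here is a minimal Python sketch of randomized dimension reduction via a Gaussian random projection (in the spirit of the Johnson-Lindenstrauss lemma). The data, dimensions, and variable names are made up for illustration and are not part of the project description.

    import numpy as np

    rng = np.random.default_rng(0)

    n_points, high_dim, low_dim = 500, 10_000, 100

    # Synthetic high-dimensional data: each row is one observation.
    X = rng.standard_normal((n_points, high_dim))

    # Gaussian random projection, scaled so that squared distances are
    # preserved in expectation.
    P = rng.standard_normal((high_dim, low_dim)) / np.sqrt(low_dim)

    Y = X @ P  # low-dimensional embedding

    # Compare one pairwise distance before and after projection.
    i, j = 0, 1
    d_high = np.linalg.norm(X[i] - X[j])
    d_low = np.linalg.norm(Y[i] - Y[j])
    print(f"original distance {d_high:.2f}, projected distance {d_low:.2f}")

Working in the 100-dimensional embedding rather than the original 10,000 dimensions is what makes downstream computations cheaper, provided distances (and hence the geometry of the data) are approximately preserved.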
Randomized
Numerical Linear Algebra -- prerequisites: Math 60, Math 151
Overdetermined systems of linear equations appear in many applications, ranging from computed tomography (CT) to quantization. Often, the systems are so large that they cannot even be loaded into memory. Iterative methods have been developed to solve such systems by accessing only one row of the matrix at a time, making them
efficient in practice. The deterministic versions of
such algorithms suffer from slow convergence in some
cases. However, when the methods are randomized,
convergence is provably exponential in expectation.
Although it is well known that randomization is extremely
useful, it remains an open question to identify which
choices of random distributions yield the best
convergence. In this project, students would explore the types of random distributions that lead to the best performance. This can be studied both experimentally and analytically.
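For a sense of what such a method looks like, the Python sketch below implements the randomized Kaczmarz iteration, sampling rows with probability proportional to their squared norms, the sampling rule for which exponential convergence in expectation is known. The problem sizes, iteration count, and synthetic data are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic overdetermined, consistent system Ax = b.
    m, n = 2000, 50
    A = rng.standard_normal((m, n))
    x_true = rng.standard_normal(n)
    b = A @ x_true

    row_norms_sq = np.sum(A**2, axis=1)
    probs = row_norms_sq / row_norms_sq.sum()

    x = np.zeros(n)
    for _ in range(20_000):
        i = rng.choice(m, p=probs)  # pick one row at random
        # Project the current iterate onto the hyperplane {x : <a_i, x> = b_i}.
        x += (b[i] - A[i] @ x) / row_norms_sq[i] * A[i]

    print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))

Each update touches only a single row of A, which is exactly what makes such methods attractive when the full matrix cannot be held in memory.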
Clustering -- prerequisites: Math 60
Clustering methods address
the problem of identifying similar observations within a
large set of data. For example, from a large collection of medical scans and patient information, can one automatically distinguish the group of sick patients from the group of healthy ones? From an employee database, can one identify the employees who are likely to leave the company and separate them from those who will stay? These are just a few examples
of the many applications of clustering. Spectral
clustering methods can be used to solve such problems using
linear algebraic techniques. Students interested in this project can test and analyze spectral clustering methods in a particular application of interest, and/or study and improve methods that speed up the clustering process. The latter is an especially important issue
for large datasets since traditional spectral clustering
methods run in cubic time.
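As a small illustration, the Python sketch below runs spectral clustering on a toy two-cluster dataset: it builds a Gaussian-kernel affinity matrix, forms the symmetric normalized graph Laplacian, and clusters the eigenvectors of its two smallest eigenvalues with k-means. The data, kernel width, and use of scikit-learn's KMeans are assumptions made only for the example.

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)

    # Two well-separated groups of points in the plane.
    X = np.vstack([rng.normal(0, 0.5, (50, 2)),
                   rng.normal(5, 0.5, (50, 2))])

    # Affinity matrix W and symmetric normalized Laplacian L.
    dist_sq = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
    W = np.exp(-dist_sq / 2.0)
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(X)) - D_inv_sqrt @ W @ D_inv_sqrt

    # Eigenvectors for the two smallest eigenvalues embed the points;
    # k-means on this embedding recovers the clusters.
    eigvals, eigvecs = np.linalg.eigh(L)
    embedding = eigvecs[:, :2]
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embedding)
    print(labels)

The eigendecomposition here is the cubic-time bottleneck mentioned above, which is why methods that approximate or accelerate this step matter for large datasets.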
Principal Component Analysis --
prerequisites: Math 60, programming experience recommended
Principal Component
Analysis (PCA) is one of the most widely used statistical
tools for data analysis with applications in data
compression, image processing, and bioinformatics. As the name suggests, given a large number of variables and observations, PCA detects the combinations of variables that explain most of the variation in the data, called principal components. Mathematically,
given any data matrix, classical PCA finds the best
low-rank approximation to that matrix by solving an
optimization problem. This problem can be solved using the singular value decomposition (SVD), and the resulting approximation still performs well when the matrix is corrupted by small Gaussian errors. However, PCA is extremely sensitive to errors of
large magnitude, even when the number of errors in the
data is small. This poses a serious problem since in most
recent applications large errors commonly occur due to
malicious users of a system, failure of equipment, and
natural artifacts. Interested students should be
comfortable with linear algebra. Students who participate
in this project would learn various topics in
computational analysis, probability, and statistics, with
the goal of using and improving upon new robust PCA
methods.
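The Python sketch below illustrates classical PCA via the SVD on synthetic low-rank data, and shows how a single large corruption can degrade the low-rank approximation. All sizes, the chosen rank, and the corruption are illustrative assumptions, and the robust PCA methods mentioned above are not implemented here.

    import numpy as np

    rng = np.random.default_rng(0)

    n_obs, n_vars, k = 200, 50, 3

    # Low-rank data plus small Gaussian noise.
    X = rng.standard_normal((n_obs, k)) @ rng.standard_normal((k, n_vars))
    X += 0.01 * rng.standard_normal(X.shape)

    def rank_k_approx(M, k):
        # Best rank-k approximation: keep the k largest singular values.
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        return (U[:, :k] * s[:k]) @ Vt[:k]

    err_clean = np.linalg.norm(X - rank_k_approx(X, k)) / np.linalg.norm(X)

    # Corrupt a single entry with a gross error and recompute.
    X_corrupt = X.copy()
    X_corrupt[0, 0] += 1e3
    err_corrupt = np.linalg.norm(X - rank_k_approx(X_corrupt, k)) / np.linalg.norm(X)

    print(f"relative error, clean data:      {err_clean:.4f}")
    print(f"relative error, one gross error: {err_corrupt:.4f}")

A single corrupted entry is enough to pull the leading singular directions toward the error, which is exactly the sensitivity that robust PCA methods aim to overcome.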