AMSC 808N/CMSC828V: Numerical Methods for Data Science and Machine Learning
Fall 2020
Instructor: Maria Cameron
A brief description: Optimization (fundamentals of constrained and unconstrained optimization, algorithms for large-scale problems, Tikhonov and lasso regularization). Matrix data and latent factor models (Ky-Fan norms, nonlinear matrix factorization, CUR decomposition, applications). Dimensionality reduction for data visualization and organization (PCA, MDS, isomap, LLE, t-SNE, diffusion maps). Graph data analysis (basic graph algorithms (DFS and BFS), random graph models, site and edge percolation, mining large graphs).
Expectations: The students are expected to have solid knowledge of linear algebra and multivariable calculus and be able to program.
Coursework:
- Lectures
- Homework (theoretical excercises) 30%
- Group projects (programming, real data analysis, benchmark examples) 35%
- Final exam 35%
- What is data science?
- Review of linear algebra
- 2. Optimization (4.5 weeks)
-
Classification problems. Basics of constrained optimization. The KKT optimality conditions.
The active-set method for constrained minimization problem with linear constraints.
SVM: soft margins, duality, implementation, numerical issues.
-
Unconstrained minimization problem for DNNs, an overview of methods for unconstrained optimization.
Convergence and properties of gradient descend, gradient descend with errors, a motivation for stochastic gradient descend.
Stochastic gradient descend: assumptions, lemmas, convergence theorem for fixed stepsize.
Stochastic gradient descend with decreasing stepsizes, convergence theorem.
-
Subsampled inexact Newton’s method.
Features and components of second-order methods: scale-invariance, conjugate gradient method for obtaining search direction, backtracking line search. BFGS, L-BFGS, stochastic L-BFGS.
-
Gauss-Newton and Levenberg-Marquardt methods for solving nonlinear least-squares problems.
Solving a BVP for the Poisson PDE in 2D by means of NNs.
-
Geometry of linear least squares problems.
Tikhonov and Lasso regularization, coordinate descend.
- 3. Matrix Data and latent factor models (3 weeks)
-
Examples: Latent Semantic Analysis and k-means clustering algorithm, eigenfaces, collaborative filtering.
-
Ky-Fan norms. Eckart-Young-Mirsky theorem. NMF.
Methods for computing NMF: projected gradient descend, Lee-Seung, coordinate descend (one entry at-a-time, hierarchical alternating least squares (HALS), alternative nonnegative least squares (ANLS)).
-
Collaborative filtering and matrix completion. Nuclear norm.
-
CUR matrix decomposition.
- 4. Nonlinear dimersionality reduction (3 weeks)
-
Linear and Nonlinear dimensionality reduction.
-
Linear methods: PCA, MDS.
-
Isomap.
-
Locally linear embedding (LLE).
-
Student’s t-distribution stochastic neighbor embedding (t-SNE).
-
Diffusion maps.
Diffusion maps and Laplacian eigenmaps. Diffusion maps with renormalization parameter alpha.
- 5. Numerical methods for graph data analysis (3.5 weeks)
-
Basic graph definitions.
-
Breadth-first search. Depth-first search.
-
Poisson random graphs. Generating functions.
Random graphs with specified degree distribution.
-
Scale-free random networks.
Growth and preferential attachment. Vulnerability to random failures and targeted attacks.
-
Site percolation in random graphs with power-law degree distribution
SIR model on random graphs. Power-law degree distribution case study.
-
Mining large graphs. Brin’s and Page’s PageRank, that ranks pages according to their importance.
Take-home final exams:
Some key references:
- David Bindel, Cornell University, http://www.cs.cornell.edu/~bindel/class/sjtu-summer19/index.html
- Austin, Benson, Cornell University, http://www.cs.cornell.edu/courses/cs6241/2019sp/
- https://epubs.siam.org/doi/abs/10.1137/16M1080173
https://leon.bottou.org/publications/pdf/tr-optml-2016.pdf L. Bottow, F. Curtis, J,. Nocedal, Optimization methods for large-scale machine learning, SIAM Review, Vol. 60, #2, pp. 223—311
- https://pdfs.semanticscholar.org/e6c4/925fb114d13a8568f88957c167c928f0c9f1.pdf?_ga=2.161164477.1061462531.1568569080-1140510733.1567996867 M. E. Newman, The Structure and Function of Complex Networks, SIAM Review, Vol. 45, #2, pp. 167—256
- http://networksciencebook.com/chapter/1#networks A.-L. Barabasi, Network Science