[ToDo] CS 168: The Modern Algorithmic Toolbox

Dated Sep 10, 2022; last modified on Sun, 11 Sep 2022

The Modern Algorithmic Toolbox (CS168). Gregory Valiant. web.stanford.edu. 2022. Accessed Sep 10, 2022.

Modern Hashing

  • Consistent hashing.
  • Property-preserving lossy compression. From majority elements to approximate heavy hitters. From bloom filters to the count-min sketch.
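
As a concrete taste of the count-min sketch: the snippet below is a minimal illustration, with the width, depth, and MD5-based salted hashing chosen for demonstration (not taken from the course).

```python
import hashlib

class CountMinSketch:
    """Approximate frequency counts in space independent of the number of
    distinct items. Estimates never undercount; collisions only inflate."""

    def __init__(self, width=1024, depth=4):
        # Illustrative sizes; in practice width/depth are derived from a
        # target error epsilon and failure probability delta.
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _buckets(self, item):
        # One salted hash per row stands in for independent hash functions.
        for row in range(self.depth):
            digest = hashlib.md5(f"{row}:{item}".encode()).hexdigest()
            yield row, int(digest, 16) % self.width

    def add(self, item, count=1):
        for row, col in self._buckets(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Take the least-inflated row's count.
        return min(self.table[row][col] for row, col in self._buckets(item))
```

Taking the minimum over rows is what makes the estimate a one-sided overestimate, which is exactly the guarantee the approximate-heavy-hitters application needs.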

Data with Distances

  • Similarity Search. (Dis)similarity metrics: Jaccard, Euclidean, Lp. An efficient algorithm for finding similar elements in small/medium (i.e. \(< 20\)) dimensions using k-d trees.
  • Curse of Dimensionality, kissing number. Distance-preserving compression. Estimating Jaccard Similarity using MinHash. Euclidean-distance-preserving dimensionality reduction (aka the Johnson-Lindenstrauss Transform).
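
Estimating Jaccard similarity with MinHash fits in a few lines; using Python's built-in `hash` with random salts as a stand-in for random permutations is an assumption made here for illustration.

```python
import random

def minhash_signature(items, num_hashes=128, seed=0):
    """signature[i] = min over the set of the i-th salted hash value.
    Two sets agree at position i with probability equal to their
    Jaccard similarity."""
    rng = random.Random(seed)
    salts = [rng.getrandbits(64) for _ in range(num_hashes)]
    return [min(hash((salt, x)) for x in items) for salt in salts]

def estimate_jaccard(sig_a, sig_b):
    # Fraction of agreeing signature entries estimates |A ∩ B| / |A ∪ B|.
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)
```

With sets \(\{0, \dots, 99\}\) and \(\{50, \dots, 149\}\) (true Jaccard similarity \(1/3\)), a 128-hash signature typically lands close to the truth, and longer signatures shrink the variance.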

Generalization and Regularization

  • Generalization (or, how much data is enough?). Learning an unknown function from samples from an unknown distribution. Training error vs. test error. PAC guarantees for linear classifiers. Empirical risk minimization.
  • Regularization. The polynomial embedding and random projection, L2 regularization, and L1 regularization as a computationally tractable surrogate for L0 regularization.
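
As a toy illustration of L2 regularization (a sketch, not from the course notes): with a single feature, ridge regression has a closed form, and the penalty visibly shrinks the fitted weight.

```python
def ridge_1d(xs, ys, lam):
    """Minimize sum_i (y_i - w * x_i)^2 + lam * w^2 over w.
    Setting the derivative to zero gives w = <x, y> / (<x, x> + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)
```

With `lam = 0` this is ordinary least squares; as `lam` grows, the weight is pulled toward 0, trading a little bias for lower variance on fresh data.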

Linear-Algebraic Techniques: Understanding Principal Components Analysis (PCA)

  • Understanding PCA. Minimizing squared distances equals maximizing variance. Use cases for data visualization and data compression. Failure modes for PCA.
  • How PCA works. Maximizing variance as finding the “direction of maximum stretch” of the covariance matrix. The simple geometry of “diagonals in disguise.” The power iteration algorithm.
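
The power iteration algorithm can be sketched directly in pure Python: repeatedly apply the matrix and renormalize, and the iterate rotates toward the direction of maximum stretch (the 2×2 test matrix below is an illustrative assumption).

```python
def power_iteration(matrix, num_iters=100):
    """Return (top eigenvalue, top eigenvector) of a square matrix,
    assuming the top eigenvalue strictly dominates in magnitude."""
    n = len(matrix)
    v = [1.0] * n  # any start with a component along the top eigenvector
    for _ in range(num_iters):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v (v has unit norm) estimates the eigenvalue.
    eigenvalue = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    return eigenvalue, v
```

Convergence is geometric in the ratio of the top two eigenvalues, which is why nearly tied directions make power iteration slow.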

Linear-Algebraic Techniques: Understanding the Singular Value Decomposition (SVD)

  • Low-rank matrix approximations. SVD; applications to matrix compression, de-noising, and matrix completion (i.e. recovering missing entries).
  • Tensor methods. Differences between matrices and tensors, the uniqueness of low-rank tensor factorizations, and Jennrich’s algorithm.

Spectral Graph Theory

  • Graphs as matrices and the Laplacian of a graph. Interpretations of the largest and smallest eigenvectors/eigenvalues of the Laplacian. Spectral embeddings, and an overview of applications (e.g. graph coloring, spectral clustering).
  • Interpretations of the second eigenvalue (via conductance and isoperimetric number), and the connections with the speed at which random walks/diffusions converge.
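
To make “graphs as matrices” concrete, here is the unnormalized Laplacian \(L = D - A\) together with its quadratic form \(x^\top L x = \sum_{(u,v) \in E} (x_u - x_v)^2\), the quantity that spectral embeddings keep small; the helper names are illustrative.

```python
def laplacian(n, edges):
    """Unnormalized Laplacian L = D - A of an undirected graph on
    vertices 0..n-1 (D = degree matrix, A = adjacency matrix)."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1   # degree of u
        L[v][v] += 1   # degree of v
        L[u][v] -= 1   # minus adjacency, both directions
        L[v][u] -= 1
    return L

def quadratic_form(L, x):
    """x^T L x, which equals the sum of (x_u - x_v)^2 over edges."""
    n = len(L)
    return sum(L[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
```

Every row of \(L\) sums to zero, so the all-ones vector is an eigenvector with eigenvalue 0, matching the smallest-eigenvalue interpretation above.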

Sampling and Estimation

  • Reservoir sampling (how to select a random sample from a data stream). Basic probability tools: Markov’s inequality and Chebyshev’s inequality. Importance Sampling (how to make inferences about one distribution based on samples from a different distribution). Description of the Good-Turing estimate of missing/unseen mass.
  • Markov chains and stationary distributions. Markov Chain Monte Carlo (MCMC) as an approach to solving hard problems by sampling from carefully crafted distributions.
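
Reservoir sampling is short enough to sketch in full; this is the classic one-pass variant (Algorithm R), written here for illustration.

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Uniform sample of k items from a stream of unknown length, in one
    pass and O(k) memory. After seeing i+1 items, each item is in the
    reservoir with probability k / (i + 1)."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)  # fill the reservoir first
        else:
            j = rng.randrange(i + 1)
            if j < k:  # item i is kept with probability k / (i + 1)
                reservoir[j] = item
    return reservoir
```

The key point is that the stream's length never appears in the code: the sample stays uniform no matter when the stream ends.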

The Fourier Perspective (and other bases)

  • Fourier methods, part 1.
  • Fourier methods, part 2 (emphasis on convolutions).

Sparse Vector/Matrix Recovery (Compressive Sensing) and Mathematical Programming

  • Compressive sensing.
  • Linear and convex programming. Matrix completion.

Privacy Preserving Computation

  • Differential Privacy in data analysis and machine learning.
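
As a first concrete instance (a sketch, not the course's code): the Laplace mechanism for a counting query, which has sensitivity 1 because adding or removing one person changes the count by at most 1. The function name and parameters are illustrative.

```python
import random

def private_count(true_count, epsilon, rng=None):
    """Release true_count + Laplace(1/epsilon) noise; for a
    sensitivity-1 query this satisfies epsilon-differential privacy."""
    rng = rng or random.Random()
    scale = 1.0 / epsilon
    # The difference of two iid Exponential(rate = 1/scale) draws is
    # Laplace-distributed with the desired scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise
```

Smaller epsilon means a stronger privacy guarantee and a noisier answer; the noise scale \(1/\epsilon\) makes that trade-off explicit.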