Liang, Percy

last modified on Sun, 07 Jun 2026

Oct 4, 2021	»	Journal Reviews on Fairness 8 min; updated Jun 7, 2026 Meta Instead of changing the data or learners in multiple ways and then seeing whether fairness improves, the authors postulate that the root causes of bias are the prior decisions that generated the training data. These decisions affect (a) what data was selected, and (b) the labels assigned to the examples. They propose the \(\text{Fair-SMOTE}\) (Fair Synthetic Minority Over Sampling Technique) algorithm, which (1) removes biased labels (via situation testing: if the model’s prediction for a data point changes once all of the data point's protected attributes are flipped, then that label is biased and the data point is discarded), and (2) rebalances internal distributions such that, for each value of a protected attribute, the positive and negative classes contain equal numbers of examples. The method is just as effective in reducing bias as prior approaches, and its models achieve higher recall and F1 performance. Furthermore, \(\text{Fair-SMOTE}\) can simultaneously reduce bias for more than one protected attribute. ...
Dec 14, 2023	»	[ToDo] CS 324: Large Language Models 1 min; updated Nov 30, 2025 Lectures \| CS324. Percy Liang; Tatsunori Hashimoto; Christopher Ré. stanford-cs324.github.io . 2022. Accessed Dec 14, 2023. ✅ Introduction: What is an LM? A brief history . Why does CS 324 exist? Capabilities: What are the capabilities of GPT-3? Harms I & II: Performance disparities, social biases and stereotypes, toxicity, and misinformation. Data: Data behind LLMs; documentation of datasets; data ecosystems. Security & Privacy: Security implications of LLMs; data poisoning; privacy risks and opportunities. Legality: The law on development and deployment of LLMs; distinction between law and ethics. Modeling: Tokenization; model architecture. Training: Objective functions; optimization algorithms. Parallelism: Key goal of hardware, systems, and for more than a decade, the only way to get performance. Scaling Laws: Motivating problem: hyper-parameter costs. Selective Architectures: Raising the ceiling of how big the models can get. Adaptation: Why adapt the LM? Probing; fine-tuning; lightweight fine-tuning. Environmental Impact: What is the environmental impact of LLMs?
Nov 30, 2025	»	Given Language Models, Why Learn About Large Language Models? 4 min; updated Nov 30, 2025 This part seems pertinent as a response to “LLMs are just (auto-complete; Markov chains; [insert pre-existing LM-adjacent tech]) on steroids”. Scale LLMs are massive. From 2018 to 2022, model sizes increased 5000x. OpenAI’s GPT model from June 2018 had 110M parameters; GPT-3 from May 2020 had 175B parameters. LLM providers no longer seem to advertise their parameter counts; according to leaks, GPT-4 has 1.8T parameters. LLMs as Standalone Systems Unlike LMs that were used as components of larger systems, e.g., machine translation, LLMs are increasingly capable of being standalone systems. Recall that LMs are capable of conditional generation (given a prompt, generate a completion). This allows the same LLM to solve a variety of tasks by changing the prompt, e.g., ...
Dec 14, 2023	»	Introduction to LLMs 4 min; updated Dec 17, 2023 What is a Language Model? A language model (LM) is a probability distribution over sequences of tokens. Given a vocabulary \(\mathcal{V}\) (a set of tokens), a language model \(p\) assigns a probability to each sequence of tokens \(x_1, \dots, x_L \in \mathcal{V}\). Assigning meaningful probabilities to all sequences requires syntactic knowledge and world knowledge. Given \( \mathcal{V} = \{ \text{ate}, \text{ball}, \text{cheese}, \text{mouse}, \text{the} \} \): ...
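
To make the definition above concrete, here is a minimal Python sketch of a toy language model over the same five-token vocabulary. It is only an illustration: the per-token weights in `UNIGRAM` are made up, the helper names are hypothetical, and the model treats tokens as independent, which is an assumption of this sketch rather than anything taken from the post.

```python
from itertools import product
from math import prod

# Vocabulary from the excerpt above.
VOCAB = ["ate", "ball", "cheese", "mouse", "the"]

# Hypothetical per-token weights (they sum to 1); not taken from the post.
UNIGRAM = {"ate": 0.15, "ball": 0.10, "cheese": 0.15, "mouse": 0.20, "the": 0.40}


def sequence_probability(tokens):
    """Probability of a token sequence when every token is drawn independently."""
    return prod(UNIGRAM[token] for token in tokens)


def total_mass(length):
    """Sum of sequence probabilities over every sequence of the given length."""
    return sum(sequence_probability(seq) for seq in product(VOCAB, repeat=length))


if __name__ == "__main__":
    sensible = ["the", "mouse", "ate", "the", "cheese"]
    scrambled = ["cheese", "the", "ate", "mouse", "the"]
    print(sequence_probability(sensible))
    print(sequence_probability(scrambled))
    print(total_mass(3))
```

For any fixed length, the probabilities sum to 1, so this is a valid distribution over sequences of that length. Because it ignores word order, though, it gives the sensible and the scrambled sentence the same probability, which is exactly the gap that syntactic and world knowledge are needed to close.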