| Dec 14, 2023 | » | [ToDo] CS 324: Large Language Models
1 min; updated Nov 30, 2025
Lectures | CS324. Percy Liang; Tatsunori Hashimoto; Christopher Ré. stanford-cs324.github.io . 2022. Accessed Dec 14, 2023. ✅ Introduction: What is an LM? A brief history . Why does CS 324 exist? Capabilities: What are the capabilities of GPT-3? Harms I & II: Performance disparities, social biases and stereotypes, toxicity, and misinformation. Data: Data behind LLMs; documentation of datasets; data ecosystems.... |
| Nov 30, 2025 | » | Given Language Models, Why Learn About Large Language Models?
4 min; updated Nov 30, 2025
This part of the course seems pertinent as a response to “LLMs are just (auto-complete; Markov chains; [insert pre-existing LM-adjacent tech]) on steroids”. Scale: LLMs are massive. From 2018 to 2022, model sizes increased 5000x. OpenAI’s GPT model from June 2018 had 110M parameters; GPT-3 from May 2020 had 175B parameters. LLM providers no longer seem to advertise their parameter counts; GPT-4 was leaked to have 1.8T parameters.... |
| Dec 14, 2023 | » | Introduction to LLMs
4 min; updated Dec 17, 2023
What is a Language Model? A language model (LM) is a probability distribution over sequences of tokens. Suppose we have a vocabulary \(\mathcal{V}\), a set of tokens; then a language model \(p\) assigns each sequence of tokens \(x_1, \dots, x_L\) with each \(x_i \in \mathcal{V}\) a probability. Assigning meaningful probabilities to all sequences requires both syntactic knowledge and world knowledge. Given \( \mathcal{V} = \{ \text{ate}, \text{ball}, \text{cheese}, \text{mouse}, \text{the} \} \):... |
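The definition in that excerpt can be sketched concretely. Below is a minimal toy LM over the five-word vocabulary: a bigram model whose conditional probabilities are entirely made up for illustration (a real LM would learn them from data), combined with the chain rule \(p(x_1, \dots, x_L) = \prod_i p(x_i \mid x_{i-1})\):

```python
# Toy vocabulary from the excerpt above.
VOCAB = {"ate", "ball", "cheese", "mouse", "the"}

# Hypothetical conditional probabilities p(token | previous token).
# "<s>" marks the start of a sequence. All numbers are invented for
# illustration; a trained LM would estimate them from a corpus.
BIGRAM = {
    "<s>":    {"the": 0.8, "mouse": 0.1, "cheese": 0.05, "ball": 0.03, "ate": 0.02},
    "the":    {"mouse": 0.4, "cheese": 0.3, "ball": 0.3},
    "mouse":  {"ate": 0.9, "the": 0.1},
    "ate":    {"the": 0.8, "cheese": 0.2},
    "cheese": {"the": 1.0},
    "ball":   {"the": 1.0},
}

def sequence_prob(tokens, floor=1e-6):
    """Chain rule: p(x1..xL) = prod_i p(x_i | x_{i-1}), with a tiny
    floor probability for unseen bigrams so nothing is exactly zero."""
    prob, prev = 1.0, "<s>"
    for tok in tokens:
        assert tok in VOCAB, f"{tok!r} not in vocabulary"
        prob *= BIGRAM.get(prev, {}).get(tok, floor)
        prev = tok
    return prob

fluent  = sequence_prob(["the", "mouse", "ate", "the", "cheese"])
garbled = sequence_prob(["mouse", "the", "the", "cheese", "ate"])
print(fluent, garbled)
```

Even this crude model captures the point of the definition: the syntactically plausible sequence receives far more probability mass than a scrambled one built from the same tokens.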