| Dec 14, 2023 | » | [ToDo] CS 324: Large Language Models
1 min; updated Nov 30, 2025
Lectures | CS324.
Percy Liang; Tatsunori Hashimoto; Christopher Ré.
|
| Nov 30, 2025 | » | Given Language Models, Why Learn About Large Language Models?
4 min; updated Nov 30, 2025
Scale: LLMs are massive. From 2018 to 2022, model sizes increased 5000x. OpenAI’s GPT model from June 2018 had 110M parameters; GPT-3 from May 2020 had 175B parameters. LLM providers no longer seem to advertise their parameter counts; according to a leak, GPT-4 has about 1.8T parameters. LLMs as Standalone Systems: Unlike LMs, which were used as components of larger systems (e.g., machine translation), LLMs are increasingly capable of being standalone systems. Recall that LMs are capable of conditional generation (given a prompt, generate a completion). This allows the same LLM to solve a variety of tasks by changing the prompt, e.g., ... |
| Dec 14, 2023 | » | Introduction to LLMs
4 min; updated Dec 17, 2023
What is a Language Model? A language model (LM) is a probability distribution over sequences of tokens. Suppose we have a vocabulary \(\mathcal{V}\) of tokens; then a language model \(p\) assigns each sequence of tokens \(x_1, \dots, x_L \in \mathcal{V}\) a probability. Assigning meaningful probabilities to all sequences requires syntactic knowledge and world knowledge. Given \( \mathcal{V} = \{ \text{ate}, \text{ball}, \text{cheese}, \text{mouse}, \text{the} \} \): ... |
This part seems pertinent as a response to “LLMs are just (auto-complete; Markov chains; [insert pre-existing LM-adjacent tech]) on steroids”.
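
To make the "probability distribution over sequences of tokens" definition concrete, here is a minimal Python sketch of a toy LM over the five-token vocabulary from the excerpt. The pair weights, the default weight, and the restriction to length-3 sequences are all made-up assumptions for illustration; they are not taken from the lecture notes.

```python
from itertools import product

VOCAB = ["ate", "ball", "cheese", "mouse", "the"]
LENGTH = 3  # assumption: fixed-length sequences keep the toy distribution finite

# Hypothetical hand-picked weights for adjacent token pairs that "sound right".
# Any pair not listed here falls back to DEFAULT_WEIGHT.
PAIR_WEIGHT = {
    ("the", "mouse"): 8.0,
    ("the", "cheese"): 8.0,
    ("the", "ball"): 8.0,
    ("mouse", "ate"): 6.0,
    ("ate", "the"): 6.0,
}
DEFAULT_WEIGHT = 0.5


def unnormalized_score(seq):
    """Multiply the weights of every adjacent token pair in seq."""
    score = 1.0
    for prev, cur in zip(seq, seq[1:]):
        score *= PAIR_WEIGHT.get((prev, cur), DEFAULT_WEIGHT)
    return score


# Enumerate every length-3 sequence over the vocabulary (5**3 = 125 of them)
# and compute the normalizer so that the probabilities sum to 1.
ALL_SEQUENCES = list(product(VOCAB, repeat=LENGTH))
Z = sum(unnormalized_score(s) for s in ALL_SEQUENCES)


def p(seq):
    """Probability the toy LM assigns to a length-3 token sequence."""
    return unnormalized_score(tuple(seq)) / Z


if __name__ == "__main__":
    for text in ["the mouse ate", "ate the mouse", "mouse the ate"]:
        print(f"p({text!r}) = {p(text.split()):.4f}")
```

The toy only looks at adjacent pairs, so it is exactly the kind of Markov-chain model the quip refers to. Its hand-written weights encode a little syntax and no world knowledge, which is the gap the excerpt points at.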