Nov 30, 2025 » Given Language Models, Why Learn About Large Language Models?
4 min; updated Nov 30, 2025
This part seems pertinent as a response to “LLMs are just (auto-complete; Markov chains; [insert pre-existing LM-adjacent tech]) on steroids”. Scale. LLMs are massive: from 2018 to 2022, model sizes increased roughly 5000x. OpenAI’s GPT model from June 2018 had 110M parameters; GPT-3 from May 2020 had 175B parameters. LLM providers no longer seem to advertise their parameter counts; GPT-4 was leaked to have 1.8T parameters...
Dec 14, 2023 » Introduction to LLMs
4 min; updated Dec 17, 2023
What is a Language Model? A language model (LM) is a probability distribution over sequences of tokens. Suppose we have a vocabulary \(\mathcal{V}\), a set of tokens; then a language model \(p\) assigns each sequence of tokens \(x_1, \dots, x_L \in \mathcal{V}\) a probability. Assigning meaningful probabilities to all sequences requires both syntactic knowledge and world knowledge. Given \( \mathcal{V} = \{ \text{ate}, \text{ball}, \text{cheese}, \text{mouse}, \text{the} \} \):...
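The definition above can be made concrete with a toy sketch: a bigram model over the five-token vocabulary from the excerpt, scoring a whole sequence by the chain rule. All conditional probabilities below are made-up illustration values (they are not from the post), and the `<s>` start marker and fallback smoothing constant are assumptions of this sketch.

```python
# Toy language model: a probability distribution over token sequences,
# factored into bigram conditionals p(x_i | x_{i-1}) via the chain rule.
VOCAB = {"ate", "ball", "cheese", "mouse", "the"}

# Hypothetical bigram conditionals; "<s>" marks the sequence start.
BIGRAM = {
    ("<s>", "the"): 0.7,
    ("the", "mouse"): 0.4,
    ("the", "cheese"): 0.3,
    ("the", "ball"): 0.3,
    ("mouse", "ate"): 0.9,
    ("ate", "the"): 0.8,
}
FALLBACK = 0.001  # tiny probability for unseen bigrams

def sequence_probability(tokens):
    """p(x_1, ..., x_L) = prod_i p(x_i | x_{i-1})."""
    prob = 1.0
    prev = "<s>"
    for tok in tokens:
        assert tok in VOCAB, f"unknown token: {tok}"
        prob *= BIGRAM.get((prev, tok), FALLBACK)
        prev = tok
    return prob
```

With these made-up numbers, a grammatical sequence like `["the", "mouse", "ate", "the", "cheese"]` scores far higher than a scrambled one such as `["mouse", "the", "the", "cheese", "ate"]`, which is exactly the "meaningful probabilities require syntactic and world knowledge" point from the excerpt.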