diff --git a/knowledge base/ai/lms.md b/knowledge base/ai/lms.md
index ac95542..ea0b108 100644
--- a/knowledge base/ai/lms.md
+++ b/knowledge base/ai/lms.md
@@ -1,9 +1,7 @@
# Language models
-_Language models_ are **statistical** models designed to understand, generate, and predict sequences of words in natural
-language.
-They analyze the structure and use of language to perform tasks such as machine translation, text generation, and
-sentiment analysis.
+Statistical or machine learning models designed to understand, generate, and predict the next token in a sequence given
+the previous ones.
1. [TL;DR](#tldr)
1. [Large Language Models](#large-language-models)
@@ -19,7 +17,34 @@ sentiment analysis.
## TL;DR
-| FIXME | Creator |
+_Tokens_ can be whole words, subwords (fragments of a word), or single characters.
+The full sequence of tokens can be an entire sentence, paragraph, or an entire essay.
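A toy illustration of the three token granularities above. The subword split is hand-picked to mimic what a learned tokenizer (e.g. BPE) might produce; real tokenizers derive these splits from data.

```python
# The same text can be tokenized at three granularities.
sentence = "unbelievable results"

word_tokens = sentence.split()                # split on whitespace
char_tokens = list(sentence.replace(" ", ""))  # one token per character
# Hand-picked subword split, only illustrative of a learned vocabulary:
subword_tokens = ["un", "believ", "able", "results"]

print(word_tokens)     # word-level
print(subword_tokens)  # subword-level
print(len(char_tokens))  # character-level token count
```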
+
+LMs are proficient at understanding human prompts in natural language.
+They analyze the structure and use of natural language, enabling machines to process and generate text that is
+contextually appropriate and coherent.
+
+Their primary purpose is to capture the **statistical** properties of natural language in mathematical notation.
+They can predict the **likelihood** that a given token will follow a sequence of other tokens by learning the
+probability distribution of patterns.
+This predictive capability is fundamental for tasks that require understanding the context and meaning of text, and it
+can be extended to more complex tasks.
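A minimal sketch of "learning the probability distribution of patterns": a bigram model that estimates the likelihood of the next token from raw counts. The corpus and function names are illustrative, not from any library.

```python
from collections import Counter, defaultdict

# Count how often each token follows each other token.
corpus = "the cat sat on the mat the cat ate".split()
counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    counts[current][nxt] += 1

def next_token_probs(token):
    """Estimate P(next token | current token) from the counts."""
    total = sum(counts[token].values())
    return {nxt: c / total for nxt, c in counts[token].items()}

# After "the", "cat" (seen twice) is more likely than "mat" (seen once).
print(next_token_probs("the"))
```

Real LMs condition on much longer contexts and use neural networks instead of count tables, but the predictive objective is the same.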
+
+_Context_ is helpful information before or after a target token.
+It can help a language model make better predictions, like determining whether "orange" refers to a citrus fruit or a
+color.
+
+_Large_ LMs are language models trained on massive datasets, encoding their acquired knowledge in up to trillions of
+parameters.
+
+_Parameters_ are internal weights and values that an LLM learns during training.
+They are used to capture patterns in language such as grammar, meaning, context and relationships between words.
+
+The more parameters a model has, the better it typically is at understanding and generating complex output.
+A higher parameter count, on the other hand, demands more computational resources for training and inference, and makes
+models more prone to overfitting, slower to respond, and harder to deploy efficiently.
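A back-of-the-envelope way to see where parameter counts come from: each dense layer contributes one weight per input-output pair plus one bias per output. The vocabulary and model sizes below are hypothetical, chosen only to show the arithmetic.

```python
def dense_params(n_in, n_out):
    """Parameters in one dense layer: weights plus biases."""
    return n_in * n_out + n_out

# Hypothetical sizes, for illustration only.
vocab, d_model = 50_000, 512

embedding = vocab * d_model                 # one vector per vocabulary token
projection = dense_params(d_model, vocab)   # hidden state back to vocabulary

print(embedding + projection)  # total parameters in these two layers alone
```

Two layers of a small model already account for tens of millions of parameters; stacking many wider layers is how counts reach the billions and beyond.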
+
+| Model     | Creator    |
| --------- | ---------- |
| [ChatGPT] | OpenAI |
| [Claude] | Anthropic |
@@ -30,10 +55,9 @@ sentiment analysis.
| [Llama] | Meta |
| [Mistral] | Mistral AI |
-LLms are good at understanding human prompts in natural language.
-
-Many models now come pre-trained, and one can use the same model for classification, summarisation, answering questions,
-data extraction, generation, reasoning, planning, translation, coding, and more.
+Many models now come pre-trained, and one can use the same model for different language-related purposes like
+classification, summarisation, answering questions, data extraction, text generation, reasoning, planning, translation,
+coding, sentiment analysis, speech recognition, and more.
They can also be further trained on additional information specific to an industry niche or a particular business.