From 34d35dadec8b3c74b01bee83f6b42969786b9b17 Mon Sep 17 00:00:00 2001 From: Michele Cereda Date: Sat, 21 Feb 2026 00:03:39 +0100 Subject: [PATCH] chore(kb/ai): review and expand notes --- knowledge base/ai/lms.md | 64 +++++++++++++++++++++++++++++++--------- 1 file changed, 50 insertions(+), 14 deletions(-) diff --git a/knowledge base/ai/lms.md b/knowledge base/ai/lms.md index ac95542..ea0b108 100644 --- a/knowledge base/ai/lms.md +++ b/knowledge base/ai/lms.md @@ -1,9 +1,7 @@ # Language models -_Language models_ are **statistical** models designed to understand, generate, and predict sequences of words in natural -language.
-They analyze the structure and use of language to perform tasks such as machine translation, text generation, and
-sentiment analysis.
+Statistical or machine learning models designed to understand and generate natural language by predicting the next
+token in a sequence given the previous ones.
 
 1. [TL;DR](#tldr)
 1. [Large Language Models](#large-language-models)
@@ -19,7 +17,34 @@ sentiment analysis.
 
 ## TL;DR
 
-| FIXME | Creator |
+_Tokens_ can be words, subwords (one or more subsets of a word), or single characters.
+The full sequence of tokens can be a sentence, a paragraph, or an entire essay.
+
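+A minimal sketch of the three granularities applied to the same sentence, using a hand-made subword split rather
+than the vocabulary of any real tokeniser:
+
+```python
+sentence = "Language models predict tokens"
+
+words = sentence.split()  # word-level tokens
+chars = list(sentence)    # character-level tokens
+subwords = ["Lang", "uage", " models", " pre", "dict", " tokens"]  # illustrative subword split
+
+print(words)      # ['Language', 'models', 'predict', 'tokens']
+print(chars[:5])  # ['L', 'a', 'n', 'g', 'u']
+```
+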
+LMs are proficient at understanding human prompts in natural language.
+They analyze the structure and use of natural language, enabling machines to process and generate text that is
+contextually appropriate and coherent.
+
+Their primary purpose is to capture the **statistical** properties of natural language in mathematical form.
+They can predict the **likelihood** that a given token will follow a sequence of other tokens by learning the
+probability distribution of token patterns in their training data.
+This predictive capability is fundamental for tasks that require understanding the context and meaning of text, and it
+can be extended to more complex tasks.
+
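+A deliberately tiny sketch of such a learned distribution is the bigram model below, which only ever looks at the
+single previous token; it counts which token follows which in a toy corpus, then normalises the counts into
+probabilities:
+
+```python
+from collections import Counter, defaultdict
+
+corpus = "the cat sat on the mat the cat ate".split()
+
+# Count, for every token, which tokens follow it and how often.
+following = defaultdict(Counter)
+for prev, nxt in zip(corpus, corpus[1:]):
+    following[prev][nxt] += 1
+
+def next_token_probabilities(token):
+    """Normalise the raw counts for `token` into a probability distribution."""
+    counts = following[token]
+    total = sum(counts.values())
+    return {nxt: count / total for nxt, count in counts.items()}
+
+print(next_token_probabilities("the"))  # {'cat': 0.66..., 'mat': 0.33...}
+```
+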
+_Context_ is helpful information before or after a target token.
+It can help a language model make better predictions, like determining whether "orange" refers to a citrus fruit or a
+color.
+
+_Large_ LMs are language models trained on massive datasets, encoding their acquired knowledge into up to trillions
+of parameters.
+
+_Parameters_ are internal weights and values that an LLM learns during training.
+They are used to capture patterns in language such as grammar, meaning, context, and relationships between words.
+
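+As a back-of-the-envelope sketch with made-up sizes, a single dense layer mapping a 512-dimensional hidden state to
+a 32 000-token vocabulary already holds millions of them:
+
+```python
+hidden_size = 512    # made-up hidden state width
+vocab_size = 32_000  # made-up vocabulary size
+
+weights = hidden_size * vocab_size  # one weight per (input, output) pair
+biases = vocab_size                 # one bias per output token
+print(weights + biases)             # 16416000, i.e. ~16.4M parameters in one layer
+```
+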
+The more parameters a model has, the better it typically is at understanding and generating complex output.
+An increased parameter count, on the other hand, demands more computational resources for training and inference, and
+makes models more prone to overfitting, slower to respond, and harder to deploy efficiently.
+
+| Provider | Creator |
 | --------- | ---------- |
 | [ChatGPT] | OpenAI |
 | [Claude] | Anthropic |
@@ -30,10 +55,9 @@ sentiment analysis.
 | [Llama] | Meta |
 | [Mistral] | Mistral AI |
 
-LLms are good at understanding human prompts in natural language.
-
-Many models now come pre-trained, and one can use the same model for classification, summarisation, answering questions,
-data extraction, generation, reasoning, planning, translation, coding, and more.
+Many models now come pre-trained, and one can use the same model for different language-related tasks like
+classification, summarisation, answering questions, data extraction, text generation, reasoning, planning, translation,
+coding, sentiment analysis, speech recognition, and more.
-They can be also be further trained on additional information specific to an industry niche or a particular business.
+They can also be further trained on additional information specific to an industry niche or a particular business.
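+
+As an example of this reuse, the sketch below assumes the Hugging Face `transformers` library is available
+(`pip install transformers`); each pipeline downloads a default pre-trained model on first use:
+
+```python
+from transformers import pipeline
+
+# Two different tasks served by pre-trained models behind the same API.
+sentiment = pipeline("sentiment-analysis")
+print(sentiment("This knowledge base entry is really helpful."))
+# e.g. [{'label': 'POSITIVE', 'score': 0.999...}]
+
+summariser = pipeline("summarization")
+print(summariser(
+    "Language models are statistical models that predict the next token in a "
+    "sequence given the previous ones. Trained on large corpora, they can be "
+    "reused for classification, summarisation, translation, and more.",
+    max_length=25,
+    min_length=10,
+))
+```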