
Large language model

Language models are computational models that predict sequences in natural language.
They are useful for speech recognition, machine translation, natural language generation, optical character recognition, route optimization, handwriting recognition, grammar induction, information retrieval, and other tasks.

Large language models are predominantly based on transformers trained on large datasets, frequently including texts scraped from the Internet.
They have superseded recurrent neural network-based models.

  1. TL;DR
  2. Reasoning
  3. Concerns
  4. Run LLMs Locally
  5. Further readings
    1. Sources

TL;DR

Model     Creator
ChatGPT   OpenAI
Claude    Anthropic
Copilot   Microsoft
Duck AI   DuckDuckGo
Gemini    Google
Grok      xAI
Llama     Meta
Mistral   Mistral AI

Reasoning

Standard generation is just autocompletion: the model tries to infer or recall what the most probable next word would be.
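
As an illustration, a minimal sketch of greedy next-token prediction using the Hugging Face transformers library (the gpt2 model and the prompt are arbitrary choices, not from this note):

```python
# Greedy autocompletion: repeatedly pick the single most probable next token.
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tokenizer("The capital of France is", return_tensors="pt").input_ids
for _ in range(5):
    with torch.no_grad():
        logits = model(ids).logits        # scores for every vocabulary token
    next_id = logits[0, -1].argmax()      # most probable next token
    ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
print(tokenizer.decode(ids[0]))
```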

Chain of Thought prompting tells models to show their work, which makes it feel like the model is calculating or thinking.
What it really does is increase the chance that the answer is correct by breaking the user's question into smaller, more manageable steps and solving each of them before giving back the final answer.
The result is more accurate, but it costs more tokens and requires a bigger context window.
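
A sketch of the difference between a plain prompt and a Chain-of-Thought prompt; ask() is a hypothetical stand-in for any chat-completion call, and the instruction wording is an assumption:

```python
def ask(prompt: str) -> str:
    """Hypothetical stand-in: send the prompt to an LLM, return its reply."""
    raise NotImplementedError

plain = "A train covers 120 km in 1.5 hours. What is its average speed?"

# Same question, but asking the model to decompose it first. The reply costs
# more tokens (the intermediate steps) and needs a bigger context window, in
# exchange for a better chance that the final answer is correct.
cot = plain + "\nBreak the problem into steps and solve each one before answering."
```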

At some point we gave models the ability to execute commands. This way the model can use (or even create) tools to get or check the answer, instead of just inferring or recalling it.
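
A minimal sketch of the idea, with a hypothetical pre-parsed reply format ({"tool": ..., "args": ...}) standing in for whatever function-calling protocol the model actually uses:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def get_time(zone: str) -> str:
    """A real function the model can call instead of guessing the answer."""
    return datetime.now(ZoneInfo(zone)).isoformat()

TOOLS = {"get_time": get_time}

def handle(llm_reply: dict) -> str:
    # Hypothetical format: {"tool": name, "args": {...}} or {"text": "..."}.
    if "tool" in llm_reply:
        # Run the requested function and hand the result back: the model
        # checks reality instead of inferring or recalling it.
        return TOOLS[llm_reply["tool"]](**llm_reply["args"])
    return llm_reply["text"]

print(handle({"tool": "get_time", "args": {"zone": "Europe/Rome"}}))
```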

The ReAct loop (reason + act) came next, where the model loops on the steps above: it breaks the request into smaller steps, acts on them using functions if necessary, checks the results, updates the chain of thought, and repeats until the request is satisfied.
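
A minimal sketch of such a loop, assuming a hypothetical llm() callable that continues the transcript with one "Thought: ...", "Action: tool(argument)" or "Final: ..." line per call:

```python
import re

def parse_action(step: str) -> tuple[str, str]:
    # "Action: search(python gil)" -> ("search", "python gil")
    match = re.match(r"Action:\s*(\w+)\((.*)\)", step)
    return match.group(1), match.group(2)

def react(question: str, llm, tools: dict, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript).strip()     # reason about the next move
        transcript += step + "\n"          # update the chain of thought
        if step.startswith("Final:"):      # request satisfied: stop looping
            return step.removeprefix("Final:").strip()
        if step.startswith("Action:"):     # act, then record the observation
            name, argument = parse_action(step)
            transcript += f"Observation: {tools[name](argument)}\n"
    return "No answer within the step budget."
```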

The next step is agentic AI.

Concerns

  • Lots of people currently think of LLMs as real intelligence, which they are not.
  • People currently give too much credibility to LLM answers, trusting them more than they trust their teachers, accountants, lawyers, or even doctors.
  • AI companies could bias their models to say specific things, subtly promote ideologies, influence elections, or even rewrite history in the minds of those who trust the LLMs.
  • Models can be vulnerable to specific attacks (e.g. prompt injection) that would change the LLM's behaviour, bias it, or hide malware in its tools.
  • People use LLMs mindlessly far too much, mostly because of the convenience they offer but also because they don't understand what these models are or how they work. This erodes critical thinking and breeds overreliance.
  • Model training and execution require resources that are normally not available to the average person. This encourages people to depend on, and hence give power to, AI companies.

Run LLMs Locally

Refer:

Ollama | Jan | LM Studio | Docker Model Runner | llama.cpp | vLLM | Llamafile
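
For example, a minimal sketch of querying a locally running model through Ollama's Python client (assumes ollama serve is running, the ollama package is installed, and the model named below has already been pulled; the model name and prompt are arbitrary):

```python
import ollama

reply = ollama.chat(
    model="llama3.2",  # any locally pulled model name works here
    messages=[{"role": "user", "content": "Summarize what an LLM is."}],
)
print(reply["message"]["content"])
```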

Further readings

Sources