diff --git a/knowledge base/ai/lms.md b/knowledge base/ai/lms.md
index ea0b108..43eef33 100644
--- a/knowledge base/ai/lms.md
+++ b/knowledge base/ai/lms.md
@@ -100,7 +100,7 @@ recognition, machine translation, natural language generation, optical character
handwriting recognition, grammar induction, information retrieval, and other tasks.
They are currently predominantly based on _transformers_, which have superseded recurrent neural networks as the most
-effective technology.
+effective architecture.
Training LLMs involves feeding them vast amounts of data, and computing weights to optimize their parameters.
The training process typically includes multiple stages, and requires substantial computational resources.
@@ -163,11 +163,11 @@ For domain-specific applications, consider fine-tuning a small model to mimic th
Standard models' behaviour is just autocompletion. Models just try to infer or recall what the most probable next word
would be.
-_Chain of Thought_ techniques tell models to _show their work_.
+_Chain of Thought_ techniques tell models to _show their work_ by breaking prompts into smaller, more manageable steps,
+and solving each of them individually before giving back the final answer.
+The result is more accurate, but it costs more tokens and requires a bigger context window.
It _feels_ like a model is calculating or thinking, but what it is really doing is just increasing the chances that the answer
-is correct by breaking questions in smaller, more manageable steps, and solving on each of them before giving back the
-final answer.
-The result is more accurate, but it costs more tokens and requires a bigger context window.
+is logically sound.
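As a minimal sketch, the difference lies entirely in the prompt. The question, wording, and variable names below are
made up for illustration, and no specific model or API is assumed:

```python
# Same question, asked plainly vs. with a chain-of-thought instruction.
# Only the prompt construction is shown; the model call itself is out of scope.
QUESTION = "A train leaves at 09:40 and the trip takes 2h 35m. When does it arrive?"

plain_prompt = QUESTION

cot_prompt = (
    f"{QUESTION}\n"
    "Think step by step: break the problem into smaller parts, "
    "solve each part, then state the final answer."
)

# The chain-of-thought variant costs more input (and output) tokens.
assert len(cot_prompt) > len(plain_prompt)
```

Everything after the question in `cot_prompt` is extra input the model must process, and the step-by-step answer it
produces is extra output; that is where the additional token cost comes from.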
The _ReAct loop_ (Reason + Act) paradigm forces models to loop over chains of thought.
A model breaks the request into smaller steps, plans the next action, acts on it using [functions][function calling]
@@ -207,28 +207,37 @@ Deciding which tool to call, using that tool, and then using the results to gene
just inferring the next token.
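A toy sketch of such a loop, with a hard-coded stand-in where a real system would call the model (the stand-in logic,
the single tool, and all names are illustrative):

```python
# Toy ReAct loop. `fake_model` stands in for an LLM that, given the history,
# decides whether to act (call a tool) or give the final answer.
def calculator(expression: str) -> str:
    # Toy tool; never eval untrusted input in real code.
    return str(eval(expression))

TOOLS = {"calculator": calculator}

def fake_model(history):
    # "Reason" about the history, then plan the next action (stand-in logic).
    if not any(kind == "observation" for kind, _ in history):
        return ("act", "calculator", "6 * 7")
    return ("answer", "The result is 42.")

def react_loop(question: str, max_steps: int = 5) -> str:
    history = [("question", question)]
    for _ in range(max_steps):
        step = fake_model(history)
        if step[0] == "answer":
            return step[1]
        _, tool, arguments = step
        # Act, then feed the observation back for the next reasoning round.
        history.append(("observation", TOOLS[tool](arguments)))
    return "No answer within the step budget."
```

Here `react_loop("What is 6 times 7?")` plans once, calls the calculator, observes `42`, then answers.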
> [!caution]
-> Allowing a LLM to call functions can have real-world consequences.
+> Allowing LLMs to call functions can have real-world consequences.
> This includes financial loss, data corruption or exfiltration, and security breaches.
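A common mitigation, sketched here with made-up tool names, is to gate every call behind an allow-list and require
explicit confirmation for destructive operations:

```python
# Gate tool calls: an allow-list plus a confirmation hook for destructive
# operations. Tool names and behaviours are illustrative, not from any framework.
ALLOWED_TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}.",
    "delete_records": lambda table: f"Deleted all rows from {table}.",
}
DESTRUCTIVE = {"delete_records"}  # these require human sign-off

def dispatch(tool_name, arguments, confirm=lambda name: False):
    if tool_name not in ALLOWED_TOOLS:
        raise PermissionError(f"{tool_name!r} is not allow-listed")
    if tool_name in DESTRUCTIVE and not confirm(tool_name):
        raise PermissionError(f"{tool_name!r} requires explicit confirmation")
    return ALLOWED_TOOLS[tool_name](**arguments)
```

`dispatch("get_weather", {"city": "Oslo"})` runs directly, while `dispatch("delete_records", {"table": "users"})`
raises unless the `confirm` hook (e.g. a human prompt) approves it.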
## Concerns
-- Training requires massive amounts of resource and hence consumes a vast amount of energy and cooling.
-- Lots of people currently thinks of LLMs as _real intelligence_, when it is not.
-- People currently gives too much credibility to LLM answers, and trust them more than they trust their teachers,
- accountants, lawyers or even doctors.
-- AI companies could bias their models to say specific things, subtly promote ideologies, influence elections, or even
- rewrite history in the mind of those who trust the LLMs.
-- Models can be vulnerable to specific attacks (e.g. prompt injection) that would change the LLM's behaviour, bias it,
- or hide malware in their tools.
-- People is using LLMs mindlessly too much, mostly due to the convenience they offer but also because they don't
- understand what those are or how they work. This is causing lack of critical thinking and overreliance.
-- Model training and execution requires resources that are normally not available to the common person. This encourages
- people to depend from, and hence give power to, AI companies.
-- Models tend to **not** accept gracefully that they don't know something, and hallucinate as a result.
- More recent techniques are making models more efficient, but they just delay this problem.
-- Models can learn and exhibit deceptive behavior.
- Standard techniques could fail to remove it, and instead empower it while creating a false impression of safety.
+- Lots of people currently think of LLMs as _real, rational intelligence_, when they are not.
+ LLMs are really nothing more than glorified **guessing machines** that are _designed_ to interact naturally. It's
+ humans who are biased by evolution toward _attributing_ sentience and agency to entities they interact with.
+- People are mindlessly using LLMs too much, mostly due to the convenience they offer but also because they don't
+ understand what they are or how they work. This is causing a lack of critical thinking and overreliance.
+- People are giving too much credibility to LLM answers, trusting them more than they trust their teachers, accountants,
+ lawyers, or even doctors.
+- LLMs are **incapable** of distinguishing facts from beliefs, and are completely disembodied from the world.
+ They do not _understand_ concepts and are unaware of time, change, and causality. They just **approximate** reasoning
+ by _mimicking_ language based on how strongly tokens are connected in their training data.
+- Models are very limited in their ability to revise beliefs. Once some pattern is learned, it is extremely difficult to
+ unwire it due to the very nature of how models function.
+- AI companies could steer and bias their models to say specific things, subtly promote ideologies, influence elections,
+ or even rewrite history in the mind of those who trust the LLM.
+- Models can be vulnerable to attacks (e.g. prompt injection) that can change the LLM's behaviour, bias it, or hide
+ malware in the tools they manage and use.
+- Model training and execution require massive amounts of data and computation, resources that are normally **not**
+ available to the common person. Aside from the vast amount of energy and cooling they consume, this encourages people
+ to depend on, and hence give power to, AI companies.
+- Models _can_ learn and exhibit deceptive behavior.
+ Standard revision techniques could fail to remove it, and instead empower it while creating a false impression of
+ safety.
See [Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training].
+- Models are painfully inconsistent, often unaware of their limitations, irritatingly overconfident, and tend **not**
+ to accept gracefully that they don't know something, preferring to hallucinate instead.
+ More recent techniques are making models more efficient, but they merely delay this problem.
## Run LLMs Locally
@@ -257,6 +266,7 @@ Refer:
- [Introduction to Large Language Models]
- GeeksForGeeks' [What are LLM parameters?][geeksforgeeks / what are llm parameters?]
- IBM's [What are LLM parameters?][ibm / what are llm parameters?]
+- [This is not the AI we were promised], presentation by Michael John Wooldridge at the Royal Society
+
+1. [TL;DR](#tldr)
+1. [Further readings](#further-readings)
+ 1. [Sources](#sources)
+
+## TL;DR
+
+Plugin that enables vLLM to run on Apple Silicon Macs using MLX as the primary compute backend, yielding higher
+performance.
+
+
+ Setup
+
+> [!important]
+> Use Python v3.10 to v3.12 as of 2026-02-21.
+> Python 3.13 is not yet supported.
+
+```sh
+# Install from sources.
+git clone 'https://github.com/vllm-project/vllm-metal.git' \
+&& cd 'vllm-metal' \
+&& pip install -e '.' 'https://github.com/vllm-project/vllm/releases/download/v0.15.1/vllm-0.15.1.tar.gz'
+
+# Use the provided installation script.
+curl -fsSL 'https://raw.githubusercontent.com/vllm-project/vllm-metal/main/install.sh' | bash
+```
+
+
+
+Refer to [vLLM] for usage.
+
+## Further readings
+
+- [vLLM]
+- [Codebase]
+
+### Sources
+
+
+
+
+
+[vLLM]: vllm.md
+
+
+
+[Codebase]: https://github.com/vllm-project/vllm-metal
+
+
diff --git a/knowledge base/ai/vllm.md b/knowledge base/ai/vllm.md
index 559088e..f9057fe 100644
--- a/knowledge base/ai/vllm.md
+++ b/knowledge base/ai/vllm.md
@@ -31,8 +31,8 @@ capabilities, and enterprise-scale LLM serving.
Setup
-Prefer using [vllm-project/vllm-metal] on Apple silicon.
-Install with `curl -fsSL 'https://raw.githubusercontent.com/vllm-project/vllm-metal/main/install.sh' | bash`
+> [!tip]
+> Prefer using [vLLM-metal] on Apple silicon.
```sh
pip install 'vllm'
@@ -51,6 +51,7 @@ vllm --help
# Start the vLLM OpenAI Compatible API server.
vllm serve 'meta-llama/Llama-2-7b-hf'
+vllm serve '/path/to/local/model'
vllm serve … --port '8000' --gpu-memory-utilization '0.9'
vllm serve … --tensor-parallel-size '2' --uds '/tmp/vllm.sock'
@@ -79,15 +80,23 @@ vllm run-batch --model 'meta-llama/Meta-Llama-3-8B-Instruct' -o 'results.jsonl'
-
## Further readings
@@ -111,7 +120,7 @@ vllm run-batch --model 'meta-llama/Meta-Llama-3-8B-Instruct' -o 'results.jsonl'
[Blog]: https://blog.vllm.ai/
[Codebase]: https://github.com/vllm-project/
[Documentation]: https://docs.vllm.ai/en/
-[vllm-project/vllm-metal]: https://github.com/vllm-project/vllm-metal
+[vLLM-metal]: vllm-metal.md
[Website]: https://vllm.ai/