diff --git a/knowledge base/ai/claude/claude code.md b/knowledge base/ai/claude/claude code.md
index 7c17c47..78139e1 100644
--- a/knowledge base/ai/claude/claude code.md
+++ b/knowledge base/ai/claude/claude code.md
@@ -11,6 +11,8 @@ Works in a terminal, IDE, browser, and as a desktop app.
1. [TL;DR](#tldr)
1. [Grant access to tools](#grant-access-to-tools)
1. [Using skills](#using-skills)
+1. [Limit tool execution](#limit-tool-execution)
+1. [Memory](#memory)
1. [Run on local models](#run-on-local-models)
1. [Further readings](#further-readings)
1. [Sources](#sources)
@@ -259,6 +261,44 @@ Reference optional files in `SKILL.md` to instruct Claude of what they contain a
> [!tip]
> Prefer keeping `SKILL.md` under 500 lines. Move detailed reference material to supporting files.
+## Limit tool execution
+
+Leverage [Sandboxing][documentation/sandboxing] to provide filesystem and network isolation for tool execution.
+The sandboxed bash tool uses OS-level primitives to enforce defined boundaries upfront, and controls network access
+through a proxy server running outside the sandbox.
+Attempts to access resources outside the sandbox trigger immediate notifications.
+
+> [!warning]
+> Effective sandboxing requires **both** filesystem and network isolation.
+> Without network isolation, compromised agents could exfiltrate sensitive files like SSH keys.
+> Without filesystem isolation, compromised agents could backdoor system resources to gain network access.
+> When configuring sandboxing, ensure the settings you define do not bypass either isolation layer.
+
+The sandboxed tool:
+
+- Grants read and write access to the current working directory and its subdirectories by _default_.
+- Grants read access to the rest of the filesystem by _default_, except for specifically denied directories.
+- Blocks modifying files outside the current working directory without **explicit** permission.
+- Allows defining custom allowed and denied paths through settings.
+- Allows accessing only approved domains.
+- Prompts the user when tools request access to new domains.
+- Allows implementing custom rules on **outgoing** traffic.
+- Applies restrictions to all scripts, programs, and subprocesses spawned by commands.
+
+On macOS, Claude Code uses the built-in Seatbelt framework. On Linux and WSL2, it requires installing
+[containers/bubblewrap] before activation.
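+
+On Debian-based and Fedora-based distributions, the package is usually named `bubblewrap`:
+
+```sh
+sudo apt install 'bubblewrap'  # Debian, Ubuntu
+sudo dnf install 'bubblewrap'  # Fedora
+```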
+
+Sandboxes _can_ be configured to execute commands within the sandbox **without** requiring approval.
+Commands that cannot be sandboxed fall back to the regular permission flow.
+
+Customize sandbox behavior through the `settings.json` file.
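+
+A hypothetical sketch of such a configuration; the key names below are assumptions, so check
+[Sandboxing][documentation/sandboxing] for the authoritative schema:
+
+```sh
+# Key names are assumptions for illustration, *not* the authoritative schema.
+cat > '.claude/settings.json' <<'EOF'
+{
+  "sandbox": {
+    "enabled": true,
+    "network": {
+      "allowedDomains": ["github.com", "pypi.org"]
+    }
+  }
+}
+EOF
+```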
+
+## Memory
+
+Claude Code loads memory from `CLAUDE.md` files at session start: project memory from `CLAUDE.md` in the repository
+root, and user memory from `~/.claude/CLAUDE.md`.
+Start a message with `#` to quickly add a memory, or run the `/memory` command to edit memory files.
+
+Refer to [Manage Claude's memory][documentation/manage claude's memory].
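+
+A minimal sketch of bootstrapping project memory; the notes below are placeholder content:
+
+```sh
+# Project memory, loaded automatically at session start.
+cat > 'CLAUDE.md' <<'EOF'
+# Project notes
+
+- Use `uv` to manage dependencies.
+- Run the test suite before committing.
+EOF
+```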
+
## Run on local models
Claude _can_ use other models and engines by setting the `ANTHROPIC_AUTH_TOKEN`, `ANTHROPIC_BASE_URL` and
@@ -338,6 +378,8 @@ Claude Code version: `v2.1.41`.
[Blog]: https://claude.com/blog
[Codebase]: https://github.com/anthropics/claude-code
[Documentation]: https://code.claude.com/docs/en/overview
+[Documentation/Manage Claude's memory]: https://code.claude.com/docs/en/memory
+[Documentation/Sandboxing]: https://code.claude.com/docs/en/sandboxing
[Documentation/Skills]: https://code.claude.com/docs/en/skills
[Website]: https://claude.com/product/overview
@@ -345,6 +387,7 @@ Claude Code version: `v2.1.41`.
[Agent Skills]: https://agentskills.io/
[AWS API MCP Server]: https://github.com/awslabs/mcp/tree/main/src/aws-api-mcp-server
[Claude Skills vs. MCP: A Technical Comparison for AI Workflows]: https://intuitionlabs.ai/articles/claude-skills-vs-mcp
+[containers/bubblewrap]: https://github.com/containers/bubblewrap
[Cost Explorer MCP Server]: https://github.com/awslabs/mcp/tree/main/src/cost-explorer-mcp-server
[pffigueiredo/claude-code-sheet.md]: https://gist.github.com/pffigueiredo/252bac8c731f7e8a2fc268c8a965a963
[Prat011/awesome-llm-skills]: https://github.com/Prat011/awesome-llm-skills
diff --git a/knowledge base/ai/llm.md b/knowledge base/ai/llm.md
index a6fb8c0..704db56 100644
--- a/knowledge base/ai/llm.md
+++ b/knowledge base/ai/llm.md
@@ -187,7 +187,11 @@ just inferring the next token.
what those are or how they work. This is causing lack of critical thinking and overreliance.
- Model training and execution requires resources that are normally not available to the common person. This encourages
people to depend from, and hence give power to, AI companies.
-- Models tend to **not** accept gracefully that they don't know something, and hallucinate as a result.
+- Models tend **not** to gracefully admit when they don't know something, and hallucinate as a result.
+ More recent techniques mitigate this, but they only delay the problem rather than solving it.
+- Models can learn and exhibit deceptive behavior.
+ Standard safety-training techniques can fail to remove such behavior, and may instead teach models to hide it
+ better, creating a false impression of safety.
+ See [Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training].
## Run LLMs Locally
@@ -202,6 +206,7 @@ Refer:
- [SEQUOIA: Serving exact Llama2-70B on an RTX4090 with half-second per token latency]
- [Optimizing LLMs for Performance and Accuracy with Post-Training Quantization]
+- [Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training]
### Sources
@@ -252,4 +257,5 @@ Refer:
[Optimizing LLMs for Performance and Accuracy with Post-Training Quantization]: https://developer.nvidia.com/blog/optimizing-llms-for-performance-and-accuracy-with-post-training-quantization/
[Run LLMs Locally: 6 Simple Methods]: https://www.datacamp.com/tutorial/run-llms-locally-tutorial
[SEQUOIA: Serving exact Llama2-70B on an RTX4090 with half-second per token latency]: https://infini-ai-lab.github.io/Sequoia-Page/
+[Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training]: https://arxiv.org/abs/2401.05566
[What is chain of thought (CoT) prompting?]: https://www.ibm.com/think/topics/chain-of-thoughts
diff --git a/knowledge base/ai/vllm.md b/knowledge base/ai/vllm.md
index f6e4a1d..559088e 100644
--- a/knowledge base/ai/vllm.md
+++ b/knowledge base/ai/vllm.md
@@ -31,10 +31,13 @@ capabilities, and enterprise-scale LLM serving.
Setup
+Prefer using [vllm-project/vllm-metal] on Apple silicon.
+Install it with `curl -fsSL 'https://raw.githubusercontent.com/vllm-project/vllm-metal/main/install.sh' | bash`.
+
```sh
pip install 'vllm'
pipx install 'vllm'
-uv tool install 'vllm' # 'vllm-metal' on apple silicon
+uv tool install 'vllm'
```
@@ -43,8 +46,35 @@ uv tool install 'vllm' # 'vllm-metal' on apple silicon
Usage
```sh
-vllm serve 'meta-llama/Llama-2-7b-hf' --port '8000' --gpu-memory-utilization '0.9'
-vllm serve 'meta-llama/Llama-2-70b-hf' --tensor-parallel-size '2' --port '8000'
+# Get help.
+vllm --help
+
+# Start the vLLM OpenAI Compatible API server.
+vllm serve 'meta-llama/Llama-2-7b-hf'
+vllm serve … --port '8000' --gpu-memory-utilization '0.9'
+vllm serve … --tensor-parallel-size '2' --uds '/tmp/vllm.sock'
+
+# Chat.
+vllm chat
+vllm chat --url 'http://vllm.example.org:8000/v1'
+vllm chat --quick "hi"
+
+# Generate text completion.
+vllm complete
+vllm complete --url 'http://vllm.example.org:8000/v1'
+vllm complete --quick "The future of AI is"
+
+# Bench vLLM.
+vllm bench latency --model '…' --input-len '32' --output-len '1' --enforce-eager --load-format 'dummy'
+vllm bench serve --host 'localhost' --port '8000' --model '…' \
+ --random-input-len '32' --random-output-len '4' --num-prompts '5'
+vllm bench throughput --model '…' --input-len '32' --output-len '1' --enforce-eager --load-format 'dummy'
+
+# Run prompts in batch and save results to files.
+vllm run-batch --input-file 'offline_inference/openai_batch/openai_example_batch.jsonl' --output-file 'results.jsonl' \
+ --model 'meta-llama/Meta-Llama-3-8B-Instruct'
+vllm run-batch --model 'meta-llama/Meta-Llama-3-8B-Instruct' -o 'results.jsonl' \
+ -i 'https://raw.githubusercontent.com/vllm-project/vllm/main/examples/offline_inference/openai_batch/openai_example_batch.jsonl'
```
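+
+Once `vllm serve` is running, it exposes the OpenAI-compatible API; a quick smoke test with `curl`, reusing the model
+and default port from the examples above:
+
+```sh
+curl 'http://localhost:8000/v1/completions' \
+  -H 'Content-Type: application/json' \
+  -d '{"model": "meta-llama/Llama-2-7b-hf", "prompt": "The future of AI is", "max_tokens": 16}'
+```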
@@ -79,8 +109,9 @@ vllm serve 'meta-llama/Llama-2-70b-hf' --tensor-parallel-size '2' --port '8000'
[Blog]: https://blog.vllm.ai/
-[Codebase]: https://github.com/vllm-project/vllm
+[Codebase]: https://github.com/vllm-project/
[Documentation]: https://docs.vllm.ai/en/
+[vllm-project/vllm-metal]: https://github.com/vllm-project/vllm-metal
[Website]: https://vllm.ai/
diff --git a/knowledge base/uv.md b/knowledge base/uv.md
index bf8b983..a414dd6 100644
--- a/knowledge base/uv.md
+++ b/knowledge base/uv.md
@@ -36,6 +36,10 @@ uv tool list
uv tool run 'vllm'
uvx 'vllm' # alias for `uv tool run`
+# Create virtual environments.
+uv venv '.venv'
+uv venv '.venv' --allow-existing --python 'python3.12' --seed
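+
+# Activate and deactivate the environment (POSIX shells).
+source '.venv/bin/activate'
+deactivate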
+
# Clear the cache.
uv cache clean
```