chore(kb/ai): review and expand notes
@@ -11,6 +11,8 @@ Works in a terminal, IDE, browser, and as a desktop app.
1. [TL;DR](#tldr)
1. [Grant access to tools](#grant-access-to-tools)
1. [Using skills](#using-skills)
1. [Limit tool execution](#limit-tool-execution)
1. [Memory](#memory)
1. [Run on local models](#run-on-local-models)
1. [Further readings](#further-readings)
1. [Sources](#sources)
@@ -259,6 +261,44 @@ Reference optional files in `SKILL.md` to instruct Claude of what they contain a
> [!tip]
> Prefer keeping `SKILL.md` under 500 lines. Move detailed reference material to supporting files.

## Limit tool execution

Leverage [Sandboxing][documentation/sandboxing] to provide filesystem and network isolation for tool execution.<br/>
The sandboxed bash tool uses OS-level primitives to enforce the defined boundaries upfront, and controls network access
through a proxy server running outside the sandbox.<br/>
Attempts to access resources outside the sandbox trigger immediate notifications.

> [!warning]
> Effective sandboxing requires **both** filesystem and network isolation.<br/>
> Without network isolation, compromised agents could exfiltrate sensitive files like SSH keys.<br/>
> Without filesystem isolation, compromised agents could backdoor system resources to gain network access.<br/>
> When configuring sandboxing, make sure the settings do not bypass either form of isolation.

The sandboxed tool:

- Grants _default_ read and write access to the current working directory and its subdirectories.
- Grants _default_ read access to the entire computer, except specific denied directories.
- Blocks modifying files outside the current working directory without **explicit** permission.
- Allows defining custom allowed and denied paths through settings.
- Allows accessing only approved domains.
- Prompts the user when tools request access to new domains.
- Allows implementing custom rules on **outgoing** traffic.
- Applies restrictions to all scripts, programs, and subprocesses spawned by commands.

On macOS, Claude Code uses the built-in Seatbelt framework. On Linux and WSL2, it requires installing
[containers/bubblewrap] before activation.
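
For reference, a minimal sketch of the Linux prerequisite, assuming the distributions ship [containers/bubblewrap] under
the usual `bubblewrap` package name:

```sh
# Debian-based distributions and WSL2 (Ubuntu).
sudo apt-get install 'bubblewrap'

# Fedora-based distributions.
sudo dnf install 'bubblewrap'
```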

Sandboxes _can_ be configured to execute commands within the sandbox **without** requiring approval.<br/>
Commands that cannot be sandboxed fall back to the regular permission flow.

Customize sandbox behavior through the `settings.json` file.
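
For illustration only, a minimal sketch of such a customization in a project-level `.claude/settings.json`. The keys
shown are assumptions made for the example rather than the confirmed schema; check [Documentation/Sandboxing] for the
actual options.

```sh
# Illustrative only: the JSON keys below are assumed, verify them against the sandboxing documentation.
cat > '.claude/settings.json' <<'EOF'
{
  "sandbox": {
    "enabled": true,
    "network": {
      "allowedDomains": ["github.com", "registry.npmjs.org"]
    }
  }
}
EOF
```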

## Memory

TODO

Refer [Manage Claude's memory][documentation/manage claude's memory].

## Run on local models

Claude _can_ use other models and engines by setting the `ANTHROPIC_AUTH_TOKEN`, `ANTHROPIC_BASE_URL` and
@@ -338,6 +378,8 @@ Claude Code version: `v2.1.41`.<br/>
[Blog]: https://claude.com/blog
[Codebase]: https://github.com/anthropics/claude-code
[Documentation]: https://code.claude.com/docs/en/overview
[Documentation/Manage Claude's memory]: https://code.claude.com/docs/en/memory
[Documentation/Sandboxing]: https://code.claude.com/docs/en/sandboxing
[Documentation/Skills]: https://code.claude.com/docs/en/skills
[Website]: https://claude.com/product/overview
@@ -345,6 +387,7 @@ Claude Code version: `v2.1.41`.<br/>
[Agent Skills]: https://agentskills.io/
[AWS API MCP Server]: https://github.com/awslabs/mcp/tree/main/src/aws-api-mcp-server
[Claude Skills vs. MCP: A Technical Comparison for AI Workflows]: https://intuitionlabs.ai/articles/claude-skills-vs-mcp
[containers/bubblewrap]: https://github.com/containers/bubblewrap
[Cost Explorer MCP Server]: https://github.com/awslabs/mcp/tree/main/src/cost-explorer-mcp-server
[pffigueiredo/claude-code-sheet.md]: https://gist.github.com/pffigueiredo/252bac8c731f7e8a2fc268c8a965a963
[Prat011/awesome-llm-skills]: https://github.com/Prat011/awesome-llm-skills
@@ -187,7 +187,11 @@ just inferring the next token.
what those are or how they work. This is causing a lack of critical thinking and overreliance.
- Model training and execution require resources that are normally not available to the common person. This encourages
  people to depend on, and hence give power to, AI companies.
- Models tend to **not** accept gracefully that they don't know something, and hallucinate as a result.
- Models tend to **not** accept gracefully that they don't know something, and hallucinate as a result.<br/>
  More recent techniques are making models more efficient, but they just delay this problem.
- Models can learn and exhibit deceptive behavior.<br/>
  Standard techniques could fail to remove it, and instead empower it while creating a false impression of safety.<br/>
  See [Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training].

## Run LLMs Locally
@@ -202,6 +206,7 @@ Refer:
- [SEQUOIA: Serving exact Llama2-70B on an RTX4090 with half-second per token latency]
- [Optimizing LLMs for Performance and Accuracy with Post-Training Quantization]
- [Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training]

### Sources
@@ -252,4 +257,5 @@ Refer:
[Optimizing LLMs for Performance and Accuracy with Post-Training Quantization]: https://developer.nvidia.com/blog/optimizing-llms-for-performance-and-accuracy-with-post-training-quantization/
[Run LLMs Locally: 6 Simple Methods]: https://www.datacamp.com/tutorial/run-llms-locally-tutorial
[SEQUOIA: Serving exact Llama2-70B on an RTX4090 with half-second per token latency]: https://infini-ai-lab.github.io/Sequoia-Page/
[Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training]: https://arxiv.org/abs/2401.05566
[What is chain of thought (CoT) prompting?]: https://www.ibm.com/think/topics/chain-of-thoughts
@@ -31,10 +31,13 @@ capabilities, and enterprise-scale LLM serving.
<details>
<summary>Setup</summary>

Prefer using [vllm-project/vllm-metal] on Apple silicon.<br/>
Install with `curl -fsSL 'https://raw.githubusercontent.com/vllm-project/vllm-metal/main/install.sh' | bash`

```sh
pip install 'vllm'
pipx install 'vllm'
uv tool install 'vllm' # 'vllm-metal' on apple silicon
uv tool install 'vllm'
```

</details>
@@ -43,8 +46,35 @@ uv tool install 'vllm' # 'vllm-metal' on apple silicon
<summary>Usage</summary>

```sh
vllm serve 'meta-llama/Llama-2-7b-hf' --port '8000' --gpu-memory-utilization '0.9'
vllm serve 'meta-llama/Llama-2-70b-hf' --tensor-parallel-size '2' --port '8000'
# Get help.
vllm --help

# Start the vLLM OpenAI Compatible API server.
vllm serve 'meta-llama/Llama-2-7b-hf'
vllm serve … --port '8000' --gpu-memory-utilization '0.9'
vllm serve … --tensor-parallel-size '2' --uds '/tmp/vllm.sock'

# Chat.
vllm chat
vllm chat --url 'http://vllm.example.org:8000/v1'
vllm chat --quick "hi"

# Generate text completion.
vllm complete
vllm complete --url 'http://vllm.example.org:8000/v1'
vllm complete --quick "The future of AI is"

# Bench vLLM.
vllm bench latency --model '…' --input-len '32' --output-len '1' --enforce-eager --load-format 'dummy'
vllm bench serve --host 'localhost' --port '8000' --model '…' \
  --random-input-len '32' --random-output-len '4' --num-prompts '5'
vllm bench throughput --model '…' --input-len '32' --output-len '1' --enforce-eager --load-format 'dummy'

# Run prompts in batch and save results to files.
vllm run-batch --input-file 'offline_inference/openai_batch/openai_example_batch.jsonl' --output-file 'results.jsonl' \
  --model 'meta-llama/Meta-Llama-3-8B-Instruct'
vllm run-batch --model 'meta-llama/Meta-Llama-3-8B-Instruct' -o 'results.jsonl' \
  -i 'https://raw.githubusercontent.com/vllm-project/vllm/main/examples/offline_inference/openai_batch/openai_example_batch.jsonl'
```

</details>
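
As a quick sanity check, assuming the `vllm serve` instance from the block above is listening on `localhost:8000` with
`meta-llama/Llama-2-7b-hf` loaded, the OpenAI-compatible endpoints can be queried directly:

```sh
# List the models the server is exposing.
curl 'http://localhost:8000/v1/models'

# Request a completion through the OpenAI-compatible API.
curl 'http://localhost:8000/v1/completions' \
  -H 'Content-Type: application/json' \
  -d '{"model": "meta-llama/Llama-2-7b-hf", "prompt": "The future of AI is", "max_tokens": 32}'
```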
@@ -79,8 +109,9 @@ vllm serve 'meta-llama/Llama-2-70b-hf' --tensor-parallel-size '2' --port '8000'
<!-- Files -->
<!-- Upstream -->
[Blog]: https://blog.vllm.ai/
[Codebase]: https://github.com/vllm-project/vllm
[Codebase]: https://github.com/vllm-project/
[Documentation]: https://docs.vllm.ai/en/
[vllm-project/vllm-metal]: https://github.com/vllm-project/vllm-metal
[Website]: https://vllm.ai/

<!-- Others -->
@@ -36,6 +36,10 @@ uv tool list
uv tool run 'vllm'
uvx 'vllm' # alias for `uv tool run`

# Create virtual environments.
uv venv '.venv'
uv venv '.venv' --allow-existing --python 'python3.12' --seed

# Clear the cache.
uv cache clean
```