diff --git a/knowledge base/ai/claude/claude code.md b/knowledge base/ai/claude/claude code.md index 7c17c47..78139e1 100644 --- a/knowledge base/ai/claude/claude code.md +++ b/knowledge base/ai/claude/claude code.md @@ -11,6 +11,8 @@ Works in a terminal, IDE, browser, and as a desktop app. 1. [TL;DR](#tldr) 1. [Grant access to tools](#grant-access-to-tools) 1. [Using skills](#using-skills) +1. [Limit tool execution](#limit-tool-execution) +1. [Memory](#memory) 1. [Run on local models](#run-on-local-models) 1. [Further readings](#further-readings) 1. [Sources](#sources) @@ -259,6 +261,44 @@ Reference optional files in `SKILL.md` to instruct Claude of what they contain a > [!tip] > Prefer keeping `SKILL.md` under 500 lines. Move detailed reference material to supporting files. +## Limit tool execution + +Leverage [Sandboxing][documentation/sandboxing] to provide filesystem and network isolation for tool execution.
+The sandboxed bash tool uses OS-level primitives to enforce defined boundaries upfront, and controls network access +through a proxy server running outside the sandbox.
+Attempts to access resources outside the sandbox trigger immediate notifications. + +> [!warning] +> Effective sandboxing requires **both** filesystem and network isolation.
+> Without network isolation, compromised agents could exfiltrate sensitive files like SSH keys.
+> Without filesystem isolation, compromised agents could backdoor system resources to gain network access.
+> When configuring sandboxing, ensure that custom settings do not bypass either isolation layer.
+
+The sandboxed tool:
+
+- Grants _default_ read and write access to the current working directory and its subdirectories.
+- Grants _default_ read access to the entire filesystem, except specific denied directories.
+- Blocks modifying files outside the current working directory without **explicit** permission.
+- Allows defining custom allowed and denied paths through settings.
+- Allows accessing only approved domains.
+- Prompts the user when tools request access to new domains.
+- Allows implementing custom rules on **outgoing** traffic.
+- Applies restrictions to all scripts, programs, and subprocesses spawned by commands.
+
+On macOS, Claude Code uses the built-in Seatbelt framework. On Linux and WSL2, it requires installing
+[containers/bubblewrap] before activation.
+
+Sandboxes _can_ be configured to execute commands within the sandbox **without** requiring approval.
+Commands that cannot be sandboxed fall back to the regular permission flow.
+
+Customize sandbox behavior through the `settings.json` file.
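+
+The snippet below is a _hypothetical_ sketch of such settings: the `sandbox.*` key names are assumptions based on
+the [Sandboxing][documentation/sandboxing] page, and should be verified against the documentation for the Claude
+Code version in use.
+
+```sh
+# Hypothetical example: the 'sandbox' key names are assumptions, check the Sandboxing documentation.
+# Merge the values into existing project settings instead of overwriting the file like done here.
+cat > '.claude/settings.json' <<'EOF'
+{
+  "sandbox": {
+    "enabled": true,
+    "autoAllowBashIfSandboxed": true,
+    "network": {
+      "allowLocalBinding": true
+    }
+  },
+  "permissions": {
+    "deny": [
+      "Read(./secrets/**)"
+    ]
+  }
+}
+EOF
+```
+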
+## Memory
+
+TODO
+
+Refer to [Manage Claude's memory][documentation/manage claude's memory].
+
 ## Run on local models
 
 Claude _can_ use other models and engines by setting the `ANTHROPIC_AUTH_TOKEN`, `ANTHROPIC_BASE_URL` and
@@ -338,6 +378,8 @@ Claude Code version: `v2.1.41`.
[Blog]: https://claude.com/blog [Codebase]: https://github.com/anthropics/claude-code [Documentation]: https://code.claude.com/docs/en/overview +[Documentation/Manage Claude's memory]: https://code.claude.com/docs/en/memory +[Documentation/Sandboxing]: https://code.claude.com/docs/en/sandboxing [Documentation/Skills]: https://code.claude.com/docs/en/skills [Website]: https://claude.com/product/overview @@ -345,6 +387,7 @@ Claude Code version: `v2.1.41`.
[Agent Skills]: https://agentskills.io/ [AWS API MCP Server]: https://github.com/awslabs/mcp/tree/main/src/aws-api-mcp-server [Claude Skills vs. MCP: A Technical Comparison for AI Workflows]: https://intuitionlabs.ai/articles/claude-skills-vs-mcp +[containers/bubblewrap]: https://github.com/containers/bubblewrap [Cost Explorer MCP Server]: https://github.com/awslabs/mcp/tree/main/src/cost-explorer-mcp-server [pffigueiredo/claude-code-sheet.md]: https://gist.github.com/pffigueiredo/252bac8c731f7e8a2fc268c8a965a963 [Prat011/awesome-llm-skills]: https://github.com/Prat011/awesome-llm-skills diff --git a/knowledge base/ai/llm.md b/knowledge base/ai/llm.md index a6fb8c0..704db56 100644 --- a/knowledge base/ai/llm.md +++ b/knowledge base/ai/llm.md @@ -187,7 +187,11 @@ just inferring the next token. what those are or how they work. This is causing lack of critical thinking and overreliance. - Model training and execution requires resources that are normally not available to the common person. This encourages people to depend from, and hence give power to, AI companies. -- Models tend to **not** accept gracefully that they don't know something, and hallucinate as a result. +- Models tend to **not** accept gracefully that they don't know something, and hallucinate as a result.
+ More recent techniques make hallucinations less frequent, but they only delay the problem rather than solving it.
+- Models can learn and exhibit deceptive behavior.
+ Standard safety training techniques could fail to remove it, and might instead teach models to better hide it,
+ creating a false impression of safety.
+ See [Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training]. ## Run LLMs Locally @@ -202,6 +206,7 @@ Refer: - [SEQUOIA: Serving exact Llama2-70B on an RTX4090 with half-second per token latency] - [Optimizing LLMs for Performance and Accuracy with Post-Training Quantization] +- [Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training] ### Sources @@ -252,4 +257,5 @@ Refer: [Optimizing LLMs for Performance and Accuracy with Post-Training Quantization]: https://developer.nvidia.com/blog/optimizing-llms-for-performance-and-accuracy-with-post-training-quantization/ [Run LLMs Locally: 6 Simple Methods]: https://www.datacamp.com/tutorial/run-llms-locally-tutorial [SEQUOIA: Serving exact Llama2-70B on an RTX4090 with half-second per token latency]: https://infini-ai-lab.github.io/Sequoia-Page/ +[Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training]: https://arxiv.org/abs/2401.05566 [What is chain of thought (CoT) prompting?]: https://www.ibm.com/think/topics/chain-of-thoughts diff --git a/knowledge base/ai/vllm.md b/knowledge base/ai/vllm.md index f6e4a1d..559088e 100644 --- a/knowledge base/ai/vllm.md +++ b/knowledge base/ai/vllm.md @@ -31,10 +31,13 @@ capabilities, and enterprise-scale LLM serving.
Setup +Prefer using [vllm-project/vllm-metal] on Apple silicon.
+Install it with `curl -fsSL 'https://raw.githubusercontent.com/vllm-project/vllm-metal/main/install.sh' | bash`.
+
 ```sh
 pip install 'vllm'
 pipx install 'vllm'
-uv tool install 'vllm' # 'vllm-metal' on apple silicon
+uv tool install 'vllm'
 ```
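+
+Once a server is running (see the usage examples below), it exposes an OpenAI-compatible REST API.
+A minimal smoke test, assuming the default port of `8000` and one of the models served in this document's examples:
+
+```sh
+# List the models currently being served.
+curl 'http://localhost:8000/v1/models'
+
+# Request a completion through the OpenAI-compatible endpoint.
+curl 'http://localhost:8000/v1/completions' \
+  -H 'Content-Type: application/json' \
+  -d '{"model": "meta-llama/Llama-2-7b-hf", "prompt": "The future of AI is", "max_tokens": 16}'
+```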
@@ -43,8 +46,35 @@ uv tool install 'vllm' # 'vllm-metal' on apple silicon Usage ```sh -vllm serve 'meta-llama/Llama-2-7b-hf' --port '8000' --gpu-memory-utilization '0.9' -vllm serve 'meta-llama/Llama-2-70b-hf' --tensor-parallel-size '2' --port '8000' +# Get help. +vllm --help + +# Start the vLLM OpenAI Compatible API server. +vllm serve 'meta-llama/Llama-2-7b-hf' +vllm serve … --port '8000' --gpu-memory-utilization '0.9' +vllm serve … --tensor-parallel-size '2' --uds '/tmp/vllm.sock' + +# Chat. +vllm chat +vllm chat --url 'http://vllm.example.org:8000/v1' +vllm chat --quick "hi" + +# Generate text completion. +vllm complete +vllm complete --url 'http://vllm.example.org:8000/v1' +vllm complete --quick "The future of AI is" + +# Bench vLLM. +vllm bench latency --model '…' --input-len '32' --output-len '1' --enforce-eager --load-format 'dummy' +vllm bench serve --host 'localhost' --port '8000' --model '…' \ + --random-input-len '32' --random-output-len '4' --num-prompts '5' +vllm bench throughput --model '…' --input-len '32' --output-len '1' --enforce-eager --load-format 'dummy' + +# Run prompts in batch and save results to files. +vllm run-batch --input-file 'offline_inference/openai_batch/openai_example_batch.jsonl' --output-file 'results.jsonl' \ + --model 'meta-llama/Meta-Llama-3-8B-Instruct' +vllm run-batch --model 'meta-llama/Meta-Llama-3-8B-Instruct' -o 'results.jsonl' \ + -i 'https://raw.githubusercontent.com/vllm-project/vllm/main/examples/offline_inference/openai_batch/openai_example_batch.jsonl' ``` @@ -79,8 +109,9 @@ vllm serve 'meta-llama/Llama-2-70b-hf' --tensor-parallel-size '2' --port '8000' [Blog]: https://blog.vllm.ai/ -[Codebase]: https://github.com/vllm-project/vllm +[Codebase]: https://github.com/vllm-project/ [Documentation]: https://docs.vllm.ai/en/ +[vllm-project/vllm-metal]: https://github.com/vllm-project/vllm-metal [Website]: https://vllm.ai/ diff --git a/knowledge base/uv.md b/knowledge base/uv.md index bf8b983..a414dd6 100644 --- a/knowledge base/uv.md +++ b/knowledge base/uv.md @@ -36,6 +36,10 @@ uv tool list uv tool run 'vllm' uvx 'vllm' # alias for `uv tool run` +# Create virtual environments. +uv venv '.venv' +uv venv '.venv' --allow-existing --python 'python3.12' --seed + # Clear the cache. uv cache clean ```
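+
+The `--seed` option preinstalls seed packages into the new environment (`pip`, plus `setuptools` and `wheel` on
+Python versions before 3.12). Activate and use the environment as usual:
+
+```sh
+# Activate the environment, use it, then deactivate it.
+source '.venv/bin/activate'
+python --version
+deactivate
+```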