Claude Code
TODO
Agentic coding tool that reads and edits files, runs commands, and integrates with external tools.
It works in the terminal, in IDEs, in the browser, and as a desktop app.
TL;DR
Warning
Normally requires an Anthropic account.
One can use Claude Code Router or Ollama to run it against a local or shared LLM instead.
Uses a scope system to determine where configurations apply and who they're shared with.
When multiple scopes are active, the more specific ones take precedence.
| Scope | Location | Area of effect | Shared |
|---|---|---|---|
| Managed (A.K.A. System) | System-level managed-settings.json | All users on the host | Yes (usually deployed by IT) |
| User | ~/.claude/ directory | Single user, across all projects | No |
| Project | .claude/ directory in a repository | All collaborators, repository only | Yes (usually committed to the repository) |
| Local | .claude/*.local.* files | Single user, repository only | No (usually gitignored) |
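As a concrete illustration, a project-scoped settings file can pin a model and restrict tool permissions. A minimal sketch follows; the keys come from the documented settings schema, but the model name and permission rules are only examples:
# Pin a model and restrict tool permissions for every collaborator (example values).
cat > .claude/settings.json <<'EOF'
{
  "model": "claude-sonnet-4-5",
  "permissions": {
    "allow": ["Bash(npm run test:*)"],
    "deny": ["Read(./.env)"]
  }
}
EOF
A .claude/settings.local.json file with the same structure would override these values for the current user only, per the Local scope above.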
Setup
brew install --cask 'claude-code'
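Should Homebrew not be available, the CLI can also be installed globally via npm under its published package name:
# Requires Node.js.
npm install -g '@anthropic-ai/claude-code'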
Usage
# Start in interactive mode.
claude
# Start in interactive mode with an initial prompt.
claude "fix the build error"
# Run a one-off task in headless mode, then exit.
claude -p 'Hi! Are you there?'
claude -p "explain this function"
# Continue the most recent conversation started in the current directory.
claude -c
# Pick a previous conversation to resume.
claude -r
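Headless mode also reads from standard input, so one-off prompts compose with pipes. A sketch; the log file name is hypothetical, while the --output-format flag is part of the CLI:
# Feed a build log to a one-off prompt.
cat build.log | claude -p 'explain the root cause of this failure'
# Emit machine-readable output for scripting.
claude -p 'summarize this project' --output-format json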
Real world use cases
Run on local models
Claude Code can use other models and inference engines by setting the ANTHROPIC_AUTH_TOKEN, ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY environment variables.
E.g.:
# Run Claude Code on a model served locally by Ollama.
ANTHROPIC_AUTH_TOKEN='ollama' ANTHROPIC_BASE_URL='http://localhost:11434' ANTHROPIC_API_KEY='' \
claude --model 'lfm2.5-thinking:1.2b'
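To avoid retyping the overrides, one can wrap them in a shell function; a minimal sketch for bash or zsh, with an arbitrary function name:
# Forward all arguments to Claude Code with the Ollama overrides applied.
claude-local() {
  ANTHROPIC_AUTH_TOKEN='ollama' \
  ANTHROPIC_BASE_URL='http://localhost:11434' \
  ANTHROPIC_API_KEY='' \
  claude "$@"
}
# E.g.:
claude-local --model 'lfm2.5-thinking:1.2b' -p 'Hi! Are you there?'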
Warning
Performance tends to drop substantially depending on the context size and the executing host.
Examples
Prompt: Hi! Are you there?
The model was loaded once right before the tests started, so that loading time is excluded from the measurements.
Requests were sent in headless mode (claude -p 'prompt').
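A single measurement can be reproduced roughly as follows; this is a sketch, assuming Ollama's OLLAMA_CONTEXT_LENGTH variable is what sets the context size:
# Serve the model with a fixed context size (assumption: OLLAMA_CONTEXT_LENGTH sets the default context).
OLLAMA_CONTEXT_LENGTH=8192 ollama serve &
# Warm-up request, so that model loading is excluded from the timing.
ANTHROPIC_AUTH_TOKEN='ollama' ANTHROPIC_BASE_URL='http://localhost:11434' ANTHROPIC_API_KEY='' \
claude --model 'glm-4.7-flash:q4_K_M' -p 'warm-up'
# Timed request.
time ANTHROPIC_AUTH_TOKEN='ollama' ANTHROPIC_BASE_URL='http://localhost:11434' ANTHROPIC_API_KEY='' \
claude --model 'glm-4.7-flash:q4_K_M' -p 'Hi! Are you there?'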
glm-4.7-flash:q4_K_M on an M3 Pro MacBook Pro 36 GB
Model: glm-4.7-flash:q4_K_M.
Host: M3 Pro MacBook Pro 36 GB.
Claude Code version: v2.1.41.
| Engine | Context (tokens) | RAM usage | Used swap | Average response time | System remained responsive |
|---|---|---|---|---|---|
| llama.cpp (ollama) | 4096 | 19 GB | No | 19s | No |
| llama.cpp (ollama) | 8192 | 19 GB | No | 48s | No |
| llama.cpp (ollama) | 16384 | 20 GB | No | 2m 16s | No |
| llama.cpp (ollama) | 32768 | 22 GB | No | 7.12s | No |
| llama.cpp (ollama) | 65536 | 25 GB | No? (unsure) | 10.25s | Meh (minor stutters) |
| llama.cpp (ollama) | 131072 | 33 GB | No | 3m 42s | No (major stutters) |