chore(kb/ai): review and expand notes

Michele Cereda
2026-02-20 02:19:19 +01:00
parent 272661fd34
commit c862d1208c
6 changed files with 121 additions and 16 deletions

View File

@@ -1,12 +1,10 @@
# AI agent
> [!caution]
> TODO
AI-enabled system or application capable of _autonomously_ performing tasks of various complexity levels by designing
workflows and using the tools made available to them.
1. [TL;DR](#tldr)
1. [Skills](#skills)
1. [Concerns](#concerns)
1. [How much context is too much?](#how-much-context-is-too-much)
1. [Security](#security)
@@ -17,10 +15,14 @@ possibly **without** the need to stop to ask permission or consent to the user.
## TL;DR
Agents design their own workflow and utilize the tools that are made available to them.<br/>
They use [LLMs][large language model] to comprehend user inputs, deconstruct and respond to requests step-by-step,
determine when to call on external tools to obtain up-to-date information, optimize workflows, and autonomously create
subtasks to achieve complex goals.
Traditional software is _deterministic_; AI is _probabilistic_.
@@ -44,6 +46,13 @@ them during the run.
Prefer **requiring** agents to ask for consent when running them.
## Skills
Skills extend AI agent capabilities with specialized knowledge and workflow definitions.
[Agent Skills] is an open standard for skills. It defines them as folders of instructions, scripts, and resources that
agents can discover and use to perform tasks more accurately and efficiently.
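As a minimal sketch of the format (the skill name and wording here are invented for illustration), a skill is a folder
whose `SKILL.md` opens with `name` and `description` frontmatter, followed by the instructions the agent loads when it
uses the skill:

```sh
mkdir -p 'changelog-writer'
cat > 'changelog-writer/SKILL.md' <<'EOF'
---
name: changelog-writer
description: Drafts changelog entries from the repository commit history.
---

Read the commit log since the last tag and group changes by type.
EOF
```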
## Concerns
Agents created by Anthropic and other companies have a history of not caring about agent abuse, and leave users on
@@ -143,6 +152,7 @@ See [An AI Agent Published a Hit Piece on Me] by Scott Shambaugh.
<!-- Others -->
[39C3 - Agentic ProbLLMs: Exploiting AI Computer-Use and Coding Agents]: https://www.youtube.com/watch?v=8pbz5y7_WkM
[39C3 - AI Agent, AI Spy]: https://www.youtube.com/watch?v=0ANECpNdt-4
[Agent Skills]: https://agentskills.io/
[Agentic ProbLLMs - The Month of AI Bugs]: https://monthofaibugs.com/
[AI Doesn't Reduce Work — It Intensifies It]: https://hbr.org/2026/02/ai-doesnt-reduce-work-it-intensifies-it
[An AI Agent Published a Hit Piece on Me]: https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me/

View File

@@ -203,9 +203,13 @@ Manually add the MCP server definition to `$HOME/.claude.json`:
## Using skills
Refer [Skills][documentation/skills].<br/>
See also:

- [Create custom skills].
- [Prat011/awesome-llm-skills].
Claude Skills follow and extend the [Agent Skills] standard format.
Skills supersede commands.<br/>
Existing `.claude/commands/` files currently still work, but skills with the same name take precedence.
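A hypothetical migration, assuming a project-level `.claude/skills/` directory and an existing `review-pr` command
(both names invented for illustration):

```sh
# Commands are single markdown files; skills are folders with a SKILL.md.
mkdir -p '.claude/skills/review-pr'
mv '.claude/commands/review-pr.md' '.claude/skills/review-pr/SKILL.md'
# The moved file will still need 'name' and 'description' frontmatter.
```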
@@ -308,6 +312,7 @@ Claude Code version: `v2.1.41`.<br/>
- [Gemini CLI]
- [OpenCode]
- [Prat011/awesome-llm-skills]
- [Claude Skills vs. MCP: A Technical Comparison for AI Workflows]
### Sources
@@ -337,7 +342,9 @@ Claude Code version: `v2.1.41`.<br/>
[Website]: https://claude.com/product/overview
<!-- Others -->
[Agent Skills]: https://agentskills.io/
[AWS API MCP Server]: https://github.com/awslabs/mcp/tree/main/src/aws-api-mcp-server
[Claude Skills vs. MCP: A Technical Comparison for AI Workflows]: https://intuitionlabs.ai/articles/claude-skills-vs-mcp
[Cost Explorer MCP Server]: https://github.com/awslabs/mcp/tree/main/src/cost-explorer-mcp-server
[pffigueiredo/claude-code-sheet.md]: https://gist.github.com/pffigueiredo/252bac8c731f7e8a2fc268c8a965a963
[Prat011/awesome-llm-skills]: https://github.com/Prat011/awesome-llm-skills

View File

@@ -41,7 +41,7 @@ llama-cli -m 'path/to/target/model.gguf' -md 'path/to/draft/model.gguf'
# Download and run models.
llama-cli -mu 'https://example.org/some/model' # URL
llama-cli -hf 'ggml-org/gemma-3-1b-it-GGUF' -c '32768' # Hugging Face
llama-cli -dr 'ai/qwen2.5' --offline # Docker Hub
# Launch the OpenAI-compatible API server.
llama-server -m 'path/to/model.gguf'
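# Query the server's OpenAI-compatible API.
# Hypothetical request, assuming the server's default '127.0.0.1:8080' listen address.
curl 'http://127.0.0.1:8080/v1/chat/completions' -H 'Content-Type: application/json' \
  -d '{"messages":[{"role":"user","content":"hello"}]}'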

View File

@@ -133,11 +133,17 @@ is correct by breaking questions in smaller, more manageable steps, and solving
final answer.<br/>
The result is more accurate, but it costs more tokens and requires a bigger context window.
The _ReAct loop_ (Reason + Act) paradigm forces models to loop over chains of thought.<br/>
A model breaks the request into smaller steps, plans the next action, acts on it using [functions][function calling]
when it deems them useful, checks the results, updates the chain of thought, and repeats this Think-Act-Observe loop
to iteratively improve its responses.
The _ReWOO_ (Reasoning WithOut Observation) method eliminates the dependence on tool outputs for action planning.<br/>
Models plan upfront, anticipating which tools to use as soon as they receive the user's initial prompt, thus avoiding
redundant tool calls.<br/>
Users can confirm the plan **before** the model executes it.
[AI agents][agent] use these methods to act autonomously.
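A minimal sketch of the ReAct pattern as a prompt scaffold, reusing the placeholder model path used elsewhere in this
knowledge base; the tool name, question, and wrapper behaviour described in the comments are assumptions for
illustration:

```sh
# A wrapper script would parse the model's 'Action:' line, run the named tool,
# append the result as 'Observation: …', and re-invoke the model with the
# extended prompt until it emits a final answer.
llama-cli -m 'path/to/model.gguf' -p 'Available tool: search(query).
Loop through Thought, Action and Observation steps, then give the final answer.
Question: <user request here>
Thought:'
```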
## Prompting
@@ -195,6 +201,7 @@ Refer:
## Further readings
- [SEQUOIA: Serving exact Llama2-70B on an RTX4090 with half-second per token latency]
- [Optimizing LLMs for Performance and Accuracy with Post-Training Quantization]
### Sources
@@ -203,6 +210,7 @@ Refer:
- [Local LLM Hosting: Complete 2026 Guide - Ollama, vLLM, LocalAI, Jan, LM Studio & More]
- [LLM skills every AI engineer must know]
- [Function calling in LLMs]
- [What is chain of thought (CoT) prompting?]
<!--
Reference
@@ -241,5 +249,7 @@ Refer:
[Looking back at speculative decoding]: https://research.google/blog/looking-back-at-speculative-decoding/
[Mistral]: https://mistral.ai/
[OpenClaw: Who are you?]: https://www.youtube.com/watch?v=hoeEclqW8Gs
[Optimizing LLMs for Performance and Accuracy with Post-Training Quantization]: https://developer.nvidia.com/blog/optimizing-llms-for-performance-and-accuracy-with-post-training-quantization/
[Run LLMs Locally: 6 Simple Methods]: https://www.datacamp.com/tutorial/run-llms-locally-tutorial
[SEQUOIA: Serving exact Llama2-70B on an RTX4090 with half-second per token latency]: https://infini-ai-lab.github.io/Sequoia-Page/
[What is chain of thought (CoT) prompting?]: https://www.ibm.com/think/topics/chain-of-thoughts

View File

@@ -34,6 +34,7 @@ capabilities, and enterprise-scale LLM serving.
```sh
pip install 'vllm'
pipx install 'vllm'
uv tool install 'vllm' # 'vllm-metal' on Apple silicon
```
</details>
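As a hypothetical smoke test (the model name is only an example), the bundled server exposes an OpenAI-compatible API,
listening on port 8000 by default:

```sh
vllm serve 'Qwen/Qwen2.5-1.5B-Instruct'
curl 'http://localhost:8000/v1/models'
```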

knowledge base/uv.md (new file, 77 lines)
View File

@@ -0,0 +1,77 @@
# uv
Fast Python package and project manager written in Rust.
1. [TL;DR](#tldr)
1. [Further readings](#further-readings)
1. [Sources](#sources)
## TL;DR
<details>
<summary>Setup</summary>
```sh
brew install 'uv'
docker pull 'ghcr.io/astral-sh/uv:0.10.4-python3.12-trixie'
pip install 'uv'
pipx install 'uv'
```
</details>
<details>
<summary>Usage</summary>
```sh
# Install applications.
# Similar to `pipx install`.
uv tool install 'ansible'
# List installed applications.
uv tool list
# Run applications.
# Similar to `pipx run`.
uv tool run 'vllm'
uvx 'vllm' # alias for `uv tool run`
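# Manage Python versions.
uv python install '3.13'
# Manage projects ('demo' and 'requests' are hypothetical examples).
uv init 'demo' && cd 'demo'
uv add 'requests'
uv run python -c 'import requests'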
# Clear the cache.
uv cache clean
```
</details>
<!-- Uncomment if used
<details>
<summary>Real world use cases</summary>
```sh
```
</details>
-->
## Further readings
- [Website]
- [Codebase]
### Sources
- [Documentation]
<!--
Reference
═╬═Time══
-->
<!-- In-article sections -->
<!-- Knowledge base -->
<!-- Files -->
<!-- Upstream -->
[Codebase]: https://github.com/astral-sh/uv
[Documentation]: https://docs.astral.sh/uv/
[Website]: https://docs.astral.sh/uv/
<!-- Others -->