Mirror of https://gitea.com/mcereda/oam.git, synced 2026-02-21 19:14:24 +00:00.
chore(kb/ai): review and expand notes
@@ -1,12 +1,10 @@
# AI agent

> [!caution]
> TODO

AI-enabled system or application capable of _autonomously_ performing tasks of various complexity levels by designing
workflows and using the tools made available to them.

1. [TL;DR](#tldr)
1. [Skills](#skills)
1. [Concerns](#concerns)
1. [How much context is too much?](#how-much-context-is-too-much)
1. [Security](#security)
@@ -17,10 +15,14 @@ possibly **without** the need to stop to ask permission or consent to the user.

## TL;DR

Agents design their own workflow and utilize the tools that are made available to them.<br/>
They use [LLMs][large language model] to comprehend user inputs, deconstruct and respond to requests step-by-step,
determine when to call on external tools to obtain up-to-date information, optimize workflows, and autonomously create
subtasks to achieve complex goals.<br/>
These capabilities go beyond natural language processing, and include decision-making, problem-solving, interacting
with external environments, and performing actions.

Traditional software is _deterministic_; AI is _probabilistic_.

@@ -44,6 +46,13 @@ them during the run.

Prefer **requiring** agents to ask for consent when running them.

## Skills

Skills extend AI agent capabilities with specialized knowledge and workflow definitions.

[Agent Skills] is an open standard for skills. It defines them as folders of instructions, scripts, and resources that
agents can discover and use to do things more accurately and efficiently.
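
As a hedged sketch, a skill folder under this standard centers on a `SKILL.md` file with YAML frontmatter, optionally
accompanied by scripts and resources. The skill name, description, and files below are made up for illustration:

```md
---
name: pdf-extraction
description: Extracts text and tables from PDF files. Use when asked to read or summarize PDFs.
---

# PDF extraction

Step-by-step instructions the agent loads once it decides this skill applies,
e.g. "run `scripts/extract.py <file>` and summarize its output".
```

The agent only reads the frontmatter during discovery, and loads the full instructions (and any bundled files) when the
skill is actually invoked.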

## Concerns

Agents created by Anthropic and other companies have a history of not caring about agent abuse, and leave users on
@@ -143,6 +152,7 @@ See [An AI Agent Published a Hit Piece on Me] by Scott Shambaugh.

<!-- Others -->
[39C3 - Agentic ProbLLMs: Exploiting AI Computer-Use and Coding Agents]: https://www.youtube.com/watch?v=8pbz5y7_WkM
[39C3 - AI Agent, AI Spy]: https://www.youtube.com/watch?v=0ANECpNdt-4
[Agent Skills]: https://agentskills.io/
[Agentic ProbLLMs - The Month of AI Bugs]: https://monthofaibugs.com/
[AI Doesn't Reduce Work — It Intensifies It]: https://hbr.org/2026/02/ai-doesnt-reduce-work-it-intensifies-it
[An AI Agent Published a Hit Piece on Me]: https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me/
@@ -203,9 +203,13 @@ Manually add the MCP server definition to `$HOME/.claude.json`:
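
A minimal sketch of such a definition, following the `mcpServers` shape Claude Code uses (the server name, command, and
package are illustrative, not taken from this document):

```json
{
  "mcpServers": {
    "aws-api": {
      "command": "uvx",
      "args": ["awslabs.aws-api-mcp-server@latest"],
      "env": {}
    }
  }
}
```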

## Using skills

Refer [Skills][documentation/skills].<br/>
See also:

- [Create custom skills].
- [Prat011/awesome-llm-skills].

Claude Skills follow and extend the [Agent Skills] standard format.

Skills supersede commands.<br/>
Existing `.claude/commands/` files will currently still work, but skills with the same name take precedence.
@@ -308,6 +312,7 @@ Claude Code version: `v2.1.41`.<br/>

- [Gemini CLI]
- [OpenCode]
- [Prat011/awesome-llm-skills]
- [Claude Skills vs. MCP: A Technical Comparison for AI Workflows]

### Sources

@@ -337,7 +342,9 @@ Claude Code version: `v2.1.41`.<br/>

[Website]: https://claude.com/product/overview

<!-- Others -->
[Agent Skills]: https://agentskills.io/
[AWS API MCP Server]: https://github.com/awslabs/mcp/tree/main/src/aws-api-mcp-server
[Claude Skills vs. MCP: A Technical Comparison for AI Workflows]: https://intuitionlabs.ai/articles/claude-skills-vs-mcp
[Cost Explorer MCP Server]: https://github.com/awslabs/mcp/tree/main/src/cost-explorer-mcp-server
[pffigueiredo/claude-code-sheet.md]: https://gist.github.com/pffigueiredo/252bac8c731f7e8a2fc268c8a965a963
[Prat011/awesome-llm-skills]: https://github.com/Prat011/awesome-llm-skills
@@ -41,7 +41,7 @@ llama-cli -m 'path/to/target/model.gguf' -md 'path/to/draft/model.gguf'
# Download and run models.
llama-cli -mu 'https://example.org/some/model'         # URL
llama-cli -hf 'ggml-org/gemma-3-1b-it-GGUF' -c '32768' # Hugging Face
llama-cli -dr 'ai/qwen2.5' --offline                   # Docker Hub

# Launch the OpenAI-compatible API server.
llama-server -m 'path/to/model.gguf'
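# Once up, query the server over its OpenAI-compatible endpoints.
# Assumption: it is listening on the default http://localhost:8080.
curl 'http://localhost:8080/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'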
@@ -133,11 +133,17 @@ is correct by breaking questions in smaller, more manageable steps, and solving
final answer.<br/>
The result is more accurate, but it costs more tokens and requires a bigger context window.

The _ReAct loop_ (Reason + Act) paradigm forces models to loop over chain-of-thoughts.<br/>
A model breaks the request into smaller steps, plans the next action, acts on it using [functions][function calling]
should it decide it needs to, checks the results, updates the chain of thoughts, and repeats this Think-Act-Observe loop
to iteratively improve upon responses.
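
The loop above can be sketched as follows. This is a toy: a real agent would ask an LLM to choose each next action,
while here a fixed plan and calculator tools stand in for the model's reasoning (all names are illustrative):

```python
# Minimal Think-Act-Observe loop sketch.
def react_loop(request, tools, plan):
    observations = []                    # the growing chain of thoughts
    for tool_name, args in plan:         # Think: pick the next step
        result = tools[tool_name](*args)                # Act: call the tool
        observations.append((tool_name, args, result))  # Observe: record it
    return observations[-1][2], observations

tools = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
# "What is (2 + 3) * 4?" decomposed into two tool calls.
answer, trace = react_loop("(2 + 3) * 4", tools, [("add", (2, 3)), ("mul", (5, 4))])
```

The key property is that each observation feeds back into the context before the next action is chosen.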

The _ReWOO_ (Reasoning WithOut Observation) method eliminates the dependence on tool outputs for action planning.<br/>
Models plan upfront, and avoid redundant usage of tools by anticipating which tools to use upon receiving the initial
prompt from the user.<br/>
Users can confirm the plan **before** the model executes it.
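
Sketched in the same toy style, the difference from ReAct is that the full plan exists (and can be vetted) before any
tool runs, and later steps reference earlier results by placeholder instead of re-planning (names are illustrative):

```python
# ReWOO-style sketch: plan upfront, optionally confirm, then execute
# without re-planning between tool calls.
def rewoo(plan, tools, confirm=lambda plan: True):
    if not confirm(plan):            # the user vets the plan before any tool runs
        return None
    results = {}
    for step_id, (tool_name, args) in plan.items():
        # Earlier results are referenced by step id placeholders (e.g. "#1").
        resolved = [results.get(a, a) for a in args]
        results[step_id] = tools[tool_name](*resolved)
    return results

tools = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
plan = {"#1": ("add", (2, 3)), "#2": ("mul", ("#1", 4))}
results = rewoo(plan, tools)
```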

[AI agents][agent] use these methods to act autonomously.

## Prompting
@@ -195,6 +201,7 @@ Refer:

## Further readings

- [SEQUOIA: Serving exact Llama2-70B on an RTX4090 with half-second per token latency]
- [Optimizing LLMs for Performance and Accuracy with Post-Training Quantization]

### Sources

@@ -203,6 +210,7 @@ Refer:

- [Local LLM Hosting: Complete 2026 Guide - Ollama, vLLM, LocalAI, Jan, LM Studio & More]
- [LLM skills every AI engineer must know]
- [Function calling in LLMs]
- [What is chain of thought (CoT) prompting?]

<!--
Reference
@@ -241,5 +249,7 @@ Refer:

[Looking back at speculative decoding]: https://research.google/blog/looking-back-at-speculative-decoding/
[Mistral]: https://mistral.ai/
[OpenClaw: Who are you?]: https://www.youtube.com/watch?v=hoeEclqW8Gs
[Optimizing LLMs for Performance and Accuracy with Post-Training Quantization]: https://developer.nvidia.com/blog/optimizing-llms-for-performance-and-accuracy-with-post-training-quantization/
[Run LLMs Locally: 6 Simple Methods]: https://www.datacamp.com/tutorial/run-llms-locally-tutorial
[SEQUOIA: Serving exact Llama2-70B on an RTX4090 with half-second per token latency]: https://infini-ai-lab.github.io/Sequoia-Page/
[What is chain of thought (CoT) prompting?]: https://www.ibm.com/think/topics/chain-of-thoughts
@@ -34,6 +34,7 @@ capabilities, and enterprise-scale LLM serving.

```sh
pip install 'vllm'
pipx install 'vllm'
uv tool install 'vllm' # 'vllm-metal' on apple silicon
```

</details>

knowledge base/uv.md (new file, 77 lines)

@@ -0,0 +1,77 @@

# uv

Fast Python package and project manager written in Rust.

1. [TL;DR](#tldr)
1. [Further readings](#further-readings)
1. [Sources](#sources)

## TL;DR

<details>
<summary>Setup</summary>

```sh
brew install 'uv'
docker pull 'ghcr.io/astral-sh/uv:0.10.4-python3.12-trixie'
pip install 'uv'
pipx install 'uv'
```

</details>

<details>
<summary>Usage</summary>

```sh
# Install applications.
# Similar to `pipx install`.
uv tool install 'ansible'

# List installed applications.
uv tool list

# Run applications.
# Similar to `pipx run`.
uv tool run 'vllm'
uvx 'vllm' # alias for `uv tool run`

# Clear the cache.
uv cache clean
```

</details>
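
uv also offers a project-centric workflow around `pyproject.toml` and a lockfile. A brief sketch (project and package
names are illustrative):

```sh
# Create a new project (writes pyproject.toml and a sample module).
uv init 'example-project'

# Add dependencies, updating `pyproject.toml` and `uv.lock`.
uv add 'requests'

# Run commands inside the project's managed virtual environment.
uv run python -c 'import requests'

# Recreate the environment from the lockfile.
uv sync
```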

<!-- Uncomment if used
<details>
<summary>Real world use cases</summary>

```sh
```

</details>
-->

## Further readings

- [Website]
- [Codebase]

### Sources

- [Documentation]

<!--
Reference
═╬═Time══
-->

<!-- In-article sections -->
<!-- Knowledge base -->
<!-- Files -->
<!-- Upstream -->
[Codebase]: https://github.com/astral-sh/uv
[Documentation]: https://docs.astral.sh/uv/
[Website]: https://docs.astral.sh/uv/

<!-- Others -->