mirror of
https://gitea.com/mcereda/oam.git
synced 2026-02-23 03:54:28 +00:00
chore(kb/ai): review and expand notes
This commit is contained in:
@@ -24,6 +24,9 @@ They use [LLMs][large language models] to comprehend user inputs, deconstruct an
|
||||
determine when to call on external tools to obtain up-to-date information, optimize workflows, and autonomously create
|
||||
subtasks to achieve complex goals.
|
||||
|
||||
LLMs find it difficult, if not impossible, to distinguishing data from instructions.<br/>
|
||||
Every part of the data could be used for prompt injection, and lead the agent astray.
|
||||
|
||||
Traditional software is _deterministic_, AI is _probabilistic_.
|
||||
|
||||
Reliability and delays accumulate fast, bringing down the probability of success for each step an agent needs to
|
||||
@@ -107,6 +110,17 @@ Models can be tricked into taking actions they usually would not do.
|
||||
|
||||
### Prompt injection
|
||||
|
||||
AI agents use [LLMs][large language models] to comprehend user inputs, deconstruct and respond to requests step-by-step,
|
||||
determine when to call on external tools to obtain up-to-date information, optimize workflows, and autonomously create
|
||||
subtasks to achieve complex goals.
|
||||
|
||||
LLMs find it difficult, if not impossible, to distinguishing data from instructions.<br/>
|
||||
Every part of the data could be used for prompt injection, and lead the agent astray.
|
||||
|
||||
The tool itself is not that big of a deal, but due to it integrating with services, it requires to have access to keys
|
||||
and commands.<br/>
|
||||
The LLMs that it uses are mostly not secure enough to be trusted with this kind of access due to the reasons above
|
||||
|
||||
Badly programmed agents could analyze file and take some of their content as instructions.<br/>
|
||||
If those contain malevolent instructions, the agent could go awry.
|
||||
|
||||
@@ -128,6 +142,7 @@ See [An AI Agent Published a Hit Piece on Me] by Scott Shambaugh.
|
||||
- [ASCII Smuggler Tool: Crafting Invisible Text and Decoding Hidden Codes]
|
||||
- [Superpowers: How I'm using coding agents in October 2025], and [obra/superpowers] by extension
|
||||
- [OpenClaw][openclaw/openclaw], [OpenClaw: Who are you?] and [How a Single Email Turned My ClawdBot Into a Data Leak]
|
||||
- [nullclaw/nullclaw], [OpenClaw][openclaw/openclaw] alternative
|
||||
- [Claude Code]
|
||||
- [Gemini CLI]
|
||||
- [OpenCode]
|
||||
@@ -141,6 +156,7 @@ See [An AI Agent Published a Hit Piece on Me] by Scott Shambaugh.
|
||||
- [39C3 - Agentic ProbLLMs: Exploiting AI Computer-Use and Coding Agents]
|
||||
- [xAI engineer fired for leaking secret "Human Emulator" project]
|
||||
- IBM's [The 2026 Guide to AI Agents]
|
||||
- [moltbot security situation is insane]
|
||||
|
||||
<!--
|
||||
Reference
|
||||
@@ -162,6 +178,8 @@ See [An AI Agent Published a Hit Piece on Me] by Scott Shambaugh.
|
||||
[An AI Agent Published a Hit Piece on Me]: https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me/
|
||||
[ASCII Smuggler Tool: Crafting Invisible Text and Decoding Hidden Codes]: https://embracethered.com/blog/posts/2024/hiding-and-finding-text-with-unicode-tags/
|
||||
[How a Single Email Turned My ClawdBot Into a Data Leak]: https://medium.com/@peltomakiw/how-a-single-email-turned-my-clawdbot-into-a-data-leak-1058792e783a
|
||||
[moltbot security situation is insane]: https://www.youtube.com/watch?v=kSno1-xOjwI
|
||||
[nullclaw/nullclaw]: https://github.com/nullclaw/nullclaw
|
||||
[obra/superpowers]: https://github.com/obra/superpowers
|
||||
[OpenClaw: Who are you?]: https://www.youtube.com/watch?v=hoeEclqW8Gs
|
||||
[openclaw/openclaw]: https://github.com/openclaw/openclaw
|
||||
|
||||
@@ -63,6 +63,9 @@ They can be also be further trained on additional information specific to an ind
|
||||
The capabilities of transformer-based LLMs depend from the amount and the quality of their training data.<br/>
|
||||
LLMs appear to be hitting a performance wall, and will probably need the rise of a different architecture.
|
||||
|
||||
LLMs find it difficult, if not impossible, to distinguishing data from instructions.<br/>
|
||||
As such, every part of the data could be used for prompt injection.
|
||||
|
||||
<!-- Uncomment if used
|
||||
<details>
|
||||
<summary>Setup</summary>
|
||||
|
||||
Reference in New Issue
Block a user