mirror of
https://gitea.com/mcereda/oam.git
synced 2026-02-09 05:44:23 +00:00
chore(kb): start article about ai agents
This commit is contained in:
108
knowledge base/ai/agent.md
Normal file
108
knowledge base/ai/agent.md
Normal file
@@ -0,0 +1,108 @@
|
|||||||
|
# AI agent
|
||||||
|
|
||||||
|
> [!caution]
|
||||||
|
> TODO
|
||||||
|
|
||||||
|
AI-enabled system that proposes to complete tasks of various complexity levels on their own, possibly **without** the
|
||||||
|
need to stop to ask permission or consent to the user.
|
||||||
|
|
||||||
|
1. [TL;DR](#tldr)
|
||||||
|
1. [Concerns](#concerns)
|
||||||
|
1. [How much context is too much?](#how-much-context-is-too-much)
|
||||||
|
1. [Security](#security)
|
||||||
|
1. [Prompt injection](#prompt-injection)
|
||||||
|
1. [Further readings](#further-readings)
|
||||||
|
1. [Sources](#sources)
|
||||||
|
|
||||||
|
## TL;DR
|
||||||
|
|
||||||
|
Traditional software is _deterministic_, AI is _probabilistic_.
|
||||||
|
|
||||||
|
Reliability and delays accumulate fast, bringing down the probability of success for each step an agent needs to
|
||||||
|
take.<br/>
|
||||||
|
E.g., consider an agent that is 95% accurate per step; any 30-steps tasks it does is going to be successful only about
|
||||||
|
21% of the times (0.95^30).
|
||||||
|
|
||||||
|
Agents require _some_ level of context to be able to execute their tasks.<br/>
|
||||||
|
They should be allowed to access only the data they need, and users should _decide_ and _knowingly take action_ to
|
||||||
|
enable the agents that **they** want to be active.<br/>
|
||||||
|
Opt-**out** should be the default.
|
||||||
|
|
||||||
|
Prefer using **local** agents.
|
||||||
|
|
||||||
|
Consider limiting agent execution to containers or otherwise isolated environments, with only (limited) access to what
|
||||||
|
they absolutely need.
|
||||||
|
|
||||||
|
Prefer **requiring** consent by agents when running them.
|
||||||
|
|
||||||
|
## Concerns
|
||||||
|
|
||||||
|
Agents created by Anthropic and other companies have a history of not caring about agent abuse, and leave users on
|
||||||
|
their own while hiding behind a disclaimer.
|
||||||
|
|
||||||
|
### How much context is too much?
|
||||||
|
|
||||||
|
Integrating agents directly into operating systems and applications transforms them from relatively neutral resource
|
||||||
|
managers into active, goal-oriented infrastructure that is ultimately controlled by the companies that develop these
|
||||||
|
systems, not by users or application developers.
|
||||||
|
|
||||||
|
Systems integrated at that level are marketed as productivity enhancers, but can they function as OS-level surveillance
|
||||||
|
and create significant privacy vulnerabilities.<br/>
|
||||||
|
They also fundamentally undermines personal agency, replacing individual choice and discovery with automated, opaque
|
||||||
|
recommendations that can obscure commercial interests and erode individual autonomy.
|
||||||
|
|
||||||
|
Microsoft's _Recall_ creates a comprehensive _photographic memory_ of all user activity, functionally acting as a
|
||||||
|
stranger watching one's activity from one's shoulder.
|
||||||
|
|
||||||
|
Wide-access agents like those end up being centralized, high-value targets for attackers, and pose an existential
|
||||||
|
threat to the privacy guarantees of meticulously engineered privacy-oriented applications.<br/>
|
||||||
|
Consider how easy Recall has been hacked (i.e., see _[TotalRecall]_).
|
||||||
|
|
||||||
|
### Security
|
||||||
|
|
||||||
|
Even if the data collected by a system is secured in some way, making it available to malevolent agents will allow them
|
||||||
|
to exfiltrate it or use it for evil.<br/>
|
||||||
|
This becomes extremely worrisome when agents are **not** managed by the user, and can be added, started, or even
|
||||||
|
created by other agents.
|
||||||
|
|
||||||
|
Many agents are configured by default to automatically approve requests.<br/>
|
||||||
|
This also allows them to create, make changes, and save files on the host they are running.
|
||||||
|
|
||||||
|
Models can be tricked into taking actions they usually would not do.
|
||||||
|
|
||||||
|
### Prompt injection
|
||||||
|
|
||||||
|
Badly programmed agents could analyze file and take some of their content as instructions.<br/>
|
||||||
|
If those contain malevolent instructions, the agent could go awry.
|
||||||
|
|
||||||
|
Instructions could also be encoded into unicode characters to appear as harmless text.<br/>
|
||||||
|
See [ASCII Smuggler Tool: Crafting Invisible Text and Decoding Hidden Codes].
|
||||||
|
|
||||||
|
It also happened that agents modified each other's settings files, helping one another escaping their respective boxes.
|
||||||
|
|
||||||
|
## Further readings
|
||||||
|
|
||||||
|
- [TotalRecall]
|
||||||
|
- [Stealing everything you've ever typed or viewed on your own Windows PC is now possible with two lines of code — inside the Copilot+ Recall disaster.]
|
||||||
|
- [Trust No AI: Prompt Injection Along The CIA Security Triad]
|
||||||
|
- [Agentic ProbLLMs - The Month of AI Bugs]
|
||||||
|
- [ASCII Smuggler Tool: Crafting Invisible Text and Decoding Hidden Codes]
|
||||||
|
|
||||||
|
### Sources
|
||||||
|
|
||||||
|
- [39C3 - AI Agent, AI Spy]
|
||||||
|
- [39C3 - Agentic ProbLLMs: Exploiting AI Computer-Use and Coding Agents]
|
||||||
|
|
||||||
|
<!--
|
||||||
|
Reference
|
||||||
|
═╬═Time══
|
||||||
|
-->
|
||||||
|
|
||||||
|
<!-- Others -->
|
||||||
|
[39C3 - Agentic ProbLLMs: Exploiting AI Computer-Use and Coding Agents]: https://www.youtube.com/watch?v=8pbz5y7_WkM
|
||||||
|
[39C3 - AI Agent, AI Spy]: https://www.youtube.com/watch?v=0ANECpNdt-4
|
||||||
|
[Agentic ProbLLMs - The Month of AI Bugs]: https://monthofaibugs.com/
|
||||||
|
[ASCII Smuggler Tool: Crafting Invisible Text and Decoding Hidden Codes]: https://embracethered.com/blog/posts/2024/hiding-and-finding-text-with-unicode-tags/
|
||||||
|
[Stealing everything you've ever typed or viewed on your own Windows PC is now possible with two lines of code — inside the Copilot+ Recall disaster.]: https://doublepulsar.com/recall-stealing-everything-youve-ever-typed-or-viewed-on-your-own-windows-pc-is-now-possible-da3e12e9465e
|
||||||
|
[TotalRecall]: https://github.com/xaitax/TotalRecall
|
||||||
|
[Trust No AI: Prompt Injection Along The CIA Security Triad]: https://arxiv.org/pdf/2412.06090
|
||||||
Reference in New Issue
Block a user