WhatsApp's AI System Passes Security Audit, But With Asterisks

NCC Group has completed a months-long audit of WhatsApp's AI-based Message Summarization Service, affirming the system's strong privacy posture while warning of critical dependencies and residual attack surfaces.

Commissioned by Meta in January 2025, the audit scrutinized the architecture behind “Private Processing,” WhatsApp's encrypted AI infrastructure, which enables features like message summarization without compromising end-to-end encryption.

NCC Group's Cryptography, Hardware Security, and AI/ML Security teams conducted the review remotely, spending a total of 115 person-days. Their findings describe a technically sophisticated system that uses Trusted Execution Environments (TEEs), Remote Attestation TLS (RA-TLS), and Oblivious HTTP (OHTTP) so that neither Meta nor a malicious insider can access message content during processing.

WhatsApp, used by over two billion users globally, introduced the summarization feature earlier this year as part of its broader “Private Processing” initiative. The system is engineered to enable advanced AI features without weakening WhatsApp's end-to-end encryption model, a significant challenge given that AI typically requires server-side access to data. WhatsApp's solution, based on TEEs and cryptographic protocols such as RA-TLS and Hybrid Public Key Encryption (HPKE), aims to reconcile that conflict by isolating and attesting the workloads that handle decrypted message data.

Meta's AI summarization feature allows WhatsApp users to send a batch of messages (such as unread group chats) to Meta-operated Large Language Models (LLMs) for summary generation. To preserve user privacy, this data is processed inside AMD SEV-SNP-powered Confidential Virtual Machines (CVMs), isolated from the broader Meta infrastructure. NVIDIA GPUs operating in Confidential Computing mode support the LLM workloads. Third-party infrastructure is also integral. Fastly acts as an Oblivious Relay, and Cloudflare manages transparency logs, ensuring trust anchors and cryptographic artifacts are publicly auditable.

NCC's evaluation focused on Meta's seven stated security assurances, including user control, data non-persistence, non-targetability, and verifiable transparency. Most of these goals were met through advanced cryptographic techniques and architecture-level separation. Importantly, the summarization feature is fully opt-in, requiring explicit user initiation. Client-side enforcement ensures that no data leaves the device without user consent.

However, the report highlighted a few previously unaddressed risks, three of which stood out:

Network Interface Misconfiguration: CVMs were found to initialize unnecessary hypervisor-assigned network interfaces, potentially enabling covert data exfiltration.
Stale Attestation Proofs: Missing freshness checks on transparency logs could allow attackers to run outdated, vulnerable images indefinitely.
Key Distribution Trust Gap: WhatsApp clients obtained Hybrid Public Key Encryption (HPKE) configurations directly from Meta, meaning Meta could, in theory, selectively serve malicious keys to specific users, which would violate the non-targetability guarantee.

All three issues were resolved during the audit. In total, 16 of 21 findings were marked fixed, with one low-risk issue pending and four others formally risk-accepted by Meta.
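
The stale-attestation and key-distribution findings both come down to the same kind of client-side check: accept an artifact only if a recent, unrevoked entry for it exists in a public log. The Python sketch below illustrates that idea only. It is not Meta's implementation, and the `LogEntry` record, the function names, and the 30-day freshness window are all assumptions made for illustration; the report does not specify them.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical policy value; the report does not state a freshness window.
MAX_LOG_AGE = timedelta(days=30)


@dataclass(frozen=True)
class LogEntry:
    """Illustrative transparency-log record for a published artifact."""
    digest: str          # hex SHA-256 of a CVM image or key configuration
    logged_at: datetime  # timezone-aware publication time
    revoked: bool = False


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def is_fresh(entry: LogEntry, now: datetime,
             max_age: timedelta = MAX_LOG_AGE) -> bool:
    """Reject revoked entries and entries outside the freshness window."""
    age = now - entry.logged_at
    return not entry.revoked and timedelta(0) <= age <= max_age


def accept_artifact(artifact: bytes, log: list[LogEntry], now: datetime) -> bool:
    """Accept an artifact only if a fresh, unrevoked log entry matches its digest."""
    digest = sha256_hex(artifact)
    return any(e.digest == digest and is_fresh(e, now) for e in log)


# Example data (all values invented for illustration).
now = datetime(2025, 6, 1, tzinfo=timezone.utc)
log = [
    LogEntry(sha256_hex(b"image-v2"), now - timedelta(days=3)),
    LogEntry(sha256_hex(b"image-v1"), now - timedelta(days=200)),
]

print(accept_artifact(b"image-v2", log, now))          # True: logged recently
print(accept_artifact(b"image-v1", log, now))          # False: entry is stale
print(accept_artifact(b"key-for-one-user", log, now))  # False: never logged
```

In this sketch, an outdated image fails because its log entry has aged out, and a key configuration served to only one user fails because it never appears in the public log at all.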

[Figure: All the issues NCC's audit uncovered. Source: NCC Group]

The audit also confirmed the system's use of Meta's own open-weight Llama 3 8B Instruct model for summarization, with a separate internal model handling content filtering. While NCC noted that prompt injection remains theoretically possible, the scope of damage is limited to misleading summaries, not broader system compromise. Meta mitigates these risks using a layered defense model and by enforcing runtime attestation of all critical components.

Despite the favorable review, NCC emphasized ongoing concerns about transparency and third-party trust. Fully verifiable transparency, such as open-sourcing all service binaries, remains incomplete. Moreover, the architecture's reliance on Meta-controlled signing keys and proprietary TEE firmware from AMD and NVIDIA introduces trust dependencies that cannot be fully audited.

NCC's report affirms that WhatsApp's summarization system is secure, but not without caveats. Users seeking maximum privacy should exercise caution and consider disabling these features entirely for sensitive conversations.
