PRIVACY & SECURITY

Privacy is the product.
Not a feature.

Most companies treat privacy as a compliance checkbox. At Boxy, privacy is the core technical challenge and our primary competitive advantage.

What privacy actually means

Many AI agents advertise privacy by saying "your data is stored locally." The catch: whenever inference or embedding is needed, they send everything to the cloud. Your messages, your contacts, your location: all of it leaves your device the moment the AI needs to think.

That's not privacy. That's a marketing claim with a backdoor.

We call our local processing environment the Box: your personal, on-device privacy layer where obfuscation, named-entity recognition (NER), and embedding all happen. Everything that can be processed locally stays inside the Box. Nothing leaves until it's been stripped of anything that could identify you.

Other agents
- Data stored locally, but sent to the cloud for inference
- Embeddings generated in the cloud with raw text
- PII exposed during every API call to LLM providers
- "Local-first" is only for storage, not processing

Boxy
- Data stored, processed, and obfuscated locally
- Embeddings generated on-device before cloud transmission
- Cloud only sees anonymized tokens, never raw data
- Local-first for storage, processing, AND inference prep

Local obfuscation

Before any data leaves your device, our on-device NER engine identifies and replaces personally identifiable information with anonymous tokens. Non-sensitive content passes through untouched — the cloud sees the conversation's intent but can never link it back to a real person.

Raw input (your device)
from: john@gmail.com
msg: "Hey, let's open a company in SF."
to: sarah@work.com
attachment: "business_plan_v3.pdf"
timestamp: 2026-02-19T14:23:01Z
location: "San Francisco, CA"

After obfuscation (your device)
from: [USER_A_EMAIL]
msg: "Hey, let's open a company in SF."
to: [USER_B_EMAIL]
attachment: [FILE_HASH_7a3f]
timestamp: 2026-02-19T14:23:01Z
location: "San Francisco, CA"

Obfuscation layer v3.1, runs inside the Box
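
To make the masking step concrete, here is a minimal sketch of deterministic, locally reversible tokenization. It is an illustration under stated assumptions, not Boxy's actual engine: a simple email regex stands in for the NER model, and the `Obfuscator` class, the `[EMAIL_xxxxxxxx]` token format, and the in-memory vault are all hypothetical names chosen for this example.

```python
import hashlib
import hmac
import re

# Assumption: a simple regex stands in for a real on-device NER model.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class Obfuscator:
    """Hypothetical masker: swaps emails for tokens and keeps the mapping local."""

    def __init__(self, secret_key: bytes):
        self._key = secret_key  # stays on the device
        self._vault = {}        # token -> original value, stays on the device

    def _token_for(self, value: str) -> str:
        # Keyed hash: the same input always yields the same token, and the
        # token cannot be recomputed by anyone who lacks the local key.
        digest = hmac.new(self._key, value.encode(), hashlib.sha256).hexdigest()
        token = f"[EMAIL_{digest[:8]}]"
        self._vault[token] = value
        return token

    def obfuscate(self, text: str) -> str:
        """Replace every email address in the text with its token."""
        return EMAIL_RE.sub(lambda m: self._token_for(m.group(0)), text)

    def restore(self, text: str) -> str:
        """Reverse the masking using the local vault only."""
        for token, value in self._vault.items():
            text = text.replace(token, value)
        return text
```

Because the token is derived from a keyed hash, the same address maps to the same token across messages, which preserves "who is talking to whom" for the model, while only the holder of the local vault can map a token back to the original value.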

Local embedding

Beyond obfuscation, we also generate embeddings locally on your device. This means we understand the semantic meaning of your data without any raw text ever reaching a remote server. The cloud receives only anonymized vectors: mathematical representations that cannot be reverse-engineered back to your original text.
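
As a rough illustration of what "embedding on the device" means mechanically, the sketch below turns already-obfuscated text into a fixed-size vector with no network call. It is a toy under loud assumptions: feature hashing stands in for a real embedding model, the `embed_locally` and `cosine` names are hypothetical, and the example says nothing about how hard a production embedding is to invert.

```python
import hashlib
import math


def embed_locally(text: str, dim: int = 64) -> list:
    """Toy feature-hashing embedding computed entirely in-process."""
    vec = [0.0] * dim
    for word in text.lower().split():
        # Hash each word to a 64-bit integer.
        h = int.from_bytes(hashlib.sha256(word.encode()).digest()[:8], "big")
        index = h % dim                           # which dimension to update
        sign = 1.0 if (h >> 63) & 1 else -1.0     # top bit picks the sign
        vec[index] += sign
    # L2-normalize so that a dot product equals cosine similarity.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a, b) -> float:
    """Cosine similarity for vectors that are already L2-normalized."""
    return sum(x * y for x, y in zip(a, b))
```

In this sketch only the list of floats returned by `embed_locally` would ever be handed to a network layer; the input string never leaves the function's caller.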

How data flows through Boxy

Every step is designed to minimize data exposure. The first four steps happen entirely on your device, only step 05 runs in the cloud, and the final proposal is de-anonymized back on your device.

01. Data Capture (The Box)
GUI agents extract context from your platforms.
Runs in sandboxed environment. No persistent storage.

02. Local NER (The Box)
On-device entity recognition identifies all PII.
Never leaves your machine. <2ms processing.

03. Obfuscation (The Box)
All PII replaced with anonymous tokens.
Deterministic masking. Reversible only by you.

04. Local Embedding (The Box)
Intent encoded as privacy-safe vectors.
Vectors cannot be reverse-engineered to text.

05. Cloud Inference (Cloud)
Model processes only anonymized context.
Zero raw data exposure. Obfuscated inputs only.

06. Proposal Generation (The Box)
Actionable proposal generated and delivered.
De-anonymized locally. Never stored in cloud.

Result: Your identity never leaves the Box. Zero PII in transit. Zero PII in cloud storage. Full functionality preserved.
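
One way to picture the six-step flow is as data: each step tagged with where it runs, plus a guard that refuses to send anything to the cloud while a known raw value is still present. This is a sketch only; the `Step` type, the `"box"`/`"cloud"` labels, and the `assert_safe_for_cloud` guard are hypothetical names for illustration, not part of Boxy's published design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    number: int
    name: str
    location: str  # "box" (on-device) or "cloud"


# The six steps described above, with their execution location.
PIPELINE = [
    Step(1, "Data Capture", "box"),
    Step(2, "Local NER", "box"),
    Step(3, "Obfuscation", "box"),
    Step(4, "Local Embedding", "box"),
    Step(5, "Cloud Inference", "cloud"),
    Step(6, "Proposal Generation", "box"),
]


def cloud_steps(pipeline) -> list:
    """Names of the steps that run off-device."""
    return [s.name for s in pipeline if s.location == "cloud"]


def assert_safe_for_cloud(payload: str, raw_values: list) -> str:
    """Hypothetical egress guard: block payloads that still contain raw PII."""
    leaked = [v for v in raw_values if v in payload]
    if leaked:
        raise ValueError(f"{len(leaked)} raw value(s) still present in payload")
    return payload
```

A guard like this would sit between step 04 and step 05, so that a bug in the masking step fails closed instead of leaking.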

Security through agent design

Beyond data privacy, we've taken an opinionated approach to agent security by abandoning the general-purpose agent design entirely.

Instead of one monolithic agent that promises to do everything, we adopt a Unix-inspired design: thousands of small agents, each doing one thing and doing it well.

This design lets us require a capability manifest from every agent: a declaration of exactly which skills it can use and which personal data it can access. An agent is never granted access beyond what it was designed to do.

capability_manifest.json — Email Draft Agent
{
  "agent_id": "email_draft_v2",
  "skills": [
    "draft_email",
    "read_contacts"
  ],
  "data_access": [
    "email_threads",
    "contact_list"
  ],
  "restricted": [
    "financial_data",
    "health_records",
    "social_media_dms",
    "browser_history"
  ]
}

This protects users when an agent malfunctions or is compromised: an email-drafting agent cannot access your financial data or social media messages, whatever happens to it. The blast radius is contained by design.
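
A manifest only contains the blast radius if something enforces it at access time. The sketch below shows one plausible deny-by-default check against the manifest fields shown above. The `check_access` and `check_skill` functions and the `CapabilityError` type are assumptions made for this example; how Boxy actually enforces manifests is not specified here.

```python
import json

# The Email Draft Agent manifest from above, embedded for a self-contained example.
MANIFEST = json.loads("""
{
  "agent_id": "email_draft_v2",
  "skills": ["draft_email", "read_contacts"],
  "data_access": ["email_threads", "contact_list"],
  "restricted": ["financial_data", "health_records",
                 "social_media_dms", "browser_history"]
}
""")


class CapabilityError(PermissionError):
    """Raised when an agent asks for something outside its manifest."""


def check_access(manifest: dict, resource: str) -> None:
    """Deny by default: allow only resources listed under data_access."""
    agent = manifest["agent_id"]
    if resource in manifest.get("restricted", []):
        raise CapabilityError(f"{agent}: '{resource}' is explicitly restricted")
    if resource not in manifest.get("data_access", []):
        raise CapabilityError(f"{agent}: '{resource}' is not declared")


def check_skill(manifest: dict, skill: str) -> None:
    """Allow only skills the manifest declares."""
    if skill not in manifest.get("skills", []):
        raise CapabilityError(f"{manifest['agent_id']}: skill '{skill}' is not declared")
```

Note the second branch in `check_access`: a resource that is neither declared nor restricted (say, a calendar) is still refused, so the "restricted" list documents intent while the allow-list does the real work.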


Core principles

Zero-knowledge architecture

Our inference pipeline never sees raw data. Only anonymized, intent-bearing tokens reach the cloud.

On-device processing

NER extraction, PII detection, and embedding generation all happen on your hardware.

No surveillance capitalism

We don't sell your data. We don't train on it. We can't even see it — our architecture makes it technically impossible for us to access your raw data, even if we wanted to.

Encrypted at rest & in transit

AES-256 encryption for stored data. TLS 1.3 for all network communication.
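
For the in-transit half of this principle, a client can refuse anything older than TLS 1.3 using only Python's standard `ssl` module. This is a minimal sketch, with `make_tls13_context` as a hypothetical helper name; it does not cover AES-256 at rest, which needs a cryptography library outside the standard library.

```python
import ssl


def make_tls13_context() -> ssl.SSLContext:
    """Client-side context that verifies certificates and requires TLS 1.3."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Refuse to negotiate TLS 1.2 or older.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx
```

The returned context can be passed to standard clients, for example `urllib.request.urlopen(url, context=ctx)`, and a handshake with a server that only offers older protocol versions will fail.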

Full audit trail

Every data access and inference request is logged in an immutable, encrypted audit trail.
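
One common way to make a log tamper-evident is a hash chain, where each entry commits to the one before it. The sketch below shows that technique as an assumption about how such a trail could work, not as Boxy's implementation; the `AuditLog` class and the event fields are hypothetical, and encryption of the entries is left out.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the event together with the previous entry's hash."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


class AuditLog:
    """Append-only log in which every entry commits to its predecessor."""

    def __init__(self):
        self._entries = []

    def append(self, event: dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        entry_hash = _entry_hash(prev, event)
        self._entries.append({"prev": prev, "event": event, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = GENESIS
        for entry in self._entries:
            if entry["prev"] != prev:
                return False
            if entry["hash"] != _entry_hash(prev, entry["event"]):
                return False
            prev = entry["hash"]
        return True
```

Changing any past event changes its hash, which no longer matches what the next entry recorded, so `verify()` returns `False` from that point on.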

Capability manifests

Every agent declares what it can access. No exceptions. No scope creep.