PRIVACY & SECURITY

Privacy is the product.
Not a feature.

Most companies treat privacy as a compliance checkbox. At Boxy, privacy is the core technical challenge and our primary competitive advantage.

What privacy actually means

Many AI agents advertise privacy by saying "your data is stored locally." The problem: whenever they need inference or embeddings, they send everything to the cloud. Your messages, your contacts, your location: all of it leaves your device the moment the AI needs to think.

That's not privacy. That's a marketing claim with a backdoor.

We call our local processing environment the Box — your personal, on-device privacy layer where all obfuscation, NER extraction, and embedding happen. Everything that can be processed locally stays inside the Box. Nothing leaves until it's been stripped of anything that could identify you.

Other agents

Data stored locally — but sent to cloud for inference

Embeddings generated in the cloud with raw text

PII exposed during every API call to LLM providers

"Local-first" is only for storage, not processing

Boxy

Data stored, processed, and obfuscated locally

Embeddings generated on-device before cloud transmission

Cloud only sees anonymized tokens — never raw data

Local-first for storage, processing, AND inference prep

Local obfuscation

Before any data leaves your device, our on-device NER engine identifies and replaces personally identifiable information with anonymous tokens. Non-sensitive content passes through untouched — the cloud sees the conversation's intent but can never link it back to a real person.

Raw input (your device)

from: john@gmail.com
msg: "Hey, let's open a company in SF."
to: sarah@work.com
attachment: "business_plan_v3.pdf"
timestamp: 2026-02-19T14:23:01Z
location: "San Francisco, CA"

After obfuscation (your device)

from: [USER_A_EMAIL]
msg: "Hey, let's open a company in SF."
to: [USER_B_EMAIL]
attachment: [FILE_HASH_7a3f]
timestamp: 2026-02-19T14:23:01Z
location: "San Francisco, CA"

Obfuscation layer v3.1 — runs inside the Box
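The token replacement shown above can be sketched in a few lines of Python. This is an illustration only, not Boxy's actual obfuscation layer: a simple email regex stands in for the on-device NER engine, and the names `obfuscate`, `token_map`, and `EMAIL_RE` are hypothetical. The point it shows is deterministic masking: the same address always maps to the same token, and the lookup table needed to reverse it stays on the device.

```python
import re

# Assumption: a regex for email addresses stands in for the on-device NER engine.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def obfuscate(text, token_map):
    """Replace each email with a stable token; record token -> original locally."""
    reverse = {original: token for token, original in token_map.items()}

    def replace(match):
        original = match.group(0)
        if original not in reverse:
            # Label users A, B, C, ... in order of first appearance
            # (this sketch only handles up to 26 distinct addresses).
            label = chr(ord("A") + len(reverse))
            token = f"[USER_{label}_EMAIL]"
            reverse[original] = token
            token_map[token] = original
        return reverse[original]

    return EMAIL_RE.sub(replace, text)


token_map = {}  # stays on the device; never sent to the cloud
raw = "from: john@gmail.com\nto: sarah@work.com\ncc: john@gmail.com"
masked = obfuscate(raw, token_map)
print(masked)
```

Because the repeated address receives the same token both times, the cloud side can still tell that the sender and the cc recipient are the same party without learning who that party is.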

Local embedding

Beyond obfuscation, we also generate embeddings locally on your laptop. This means we understand the semantic meaning of your data without any raw text ever reaching a remote server. The cloud receives only anonymized vectors — mathematical representations that cannot be reverse-engineered back to your original text.
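To make "embedding on the device" concrete, here is a toy sketch that turns already-obfuscated text into a fixed-size vector using only the Python standard library. It uses feature hashing (a bag-of-words trick), which is an assumption chosen for brevity and is far cruder than a real semantic embedding model; the `embed` and `cosine` helpers and the `DIM` size are hypothetical. It only illustrates the shape of the step: text in, normalized vector out, no network call involved.

```python
import hashlib
import math

DIM = 64  # hypothetical vector size; real embedding models use far more dimensions


def embed(text, dim=DIM):
    """Hash each word into a bucket, count hits, then L2-normalize the counts."""
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity of two already-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))


# Inputs are obfuscated first, so no raw identifier is ever embedded.
v1 = embed("[USER_A_EMAIL] wants to open a company")
v2 = embed("[USER_B_EMAIL] wants to open a company")
v3 = embed("weather forecast for tomorrow")

print(cosine(v1, v2), cosine(v1, v3))
```

The two company-related sentences share most of their words, so their vectors end up close together even though the user tokens differ.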

How data flows through Boxy

Every step is designed to minimize data exposure. Steps 01 through 04 and step 06 run entirely on your device; only step 05 runs in the cloud, and it receives obfuscated inputs only.

01. Data Capture (The Box)
GUI agents extract context from your platforms.
Runs in a sandboxed environment. No persistent storage.

02. Local NER (The Box)
On-device entity recognition identifies all PII.
Never leaves your machine. <2ms processing.

03. Obfuscation (The Box)
All PII replaced with anonymous tokens.
Deterministic masking. Reversible only by you.

04. Local Embedding (The Box)
Intent encoded as privacy-safe vectors.
Vectors cannot be reverse-engineered to text.

05. Cloud Inference (Cloud)
Model processes only anonymized context.
Zero raw data exposure. Obfuscated inputs only.

06. Proposal Generation (The Box)
Actionable proposal generated and delivered.
De-anonymized locally. Never stored in cloud.

Result: Your identity never leaves the Box. Zero PII in transit. Zero PII in cloud storage. Full functionality preserved.
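The last two steps of the flow above (cloud inference on tokens, then local de-anonymization) can be sketched as a round trip. Everything here is a hedged illustration: `fake_cloud_inference` is a hypothetical stand-in for the remote model, and the token map is assumed to have been produced earlier by the local obfuscation step. The guard before the "cloud" call shows one way to check that no known raw value is about to leave the device.

```python
import re

TOKEN_RE = re.compile(r"\[[A-Za-z0-9_]+\]")


def fake_cloud_inference(masked_context):
    """Hypothetical stand-in for step 05; it only ever sees tokens."""
    tokens = TOKEN_RE.findall(masked_context)
    return f"Proposal: reply to {tokens[0]} and copy {tokens[1]}."


def deanonymize(text, token_map):
    """Step 06: swap tokens back to real values using the local map."""
    return TOKEN_RE.sub(lambda m: token_map.get(m.group(0), m.group(0)), text)


# Local token map, assumed to come from the obfuscation step (steps 02 and 03).
token_map = {
    "[USER_A_EMAIL]": "john@gmail.com",
    "[USER_B_EMAIL]": "sarah@work.com",
}
masked_context = "from: [USER_A_EMAIL] to: [USER_B_EMAIL] msg: let's open a company"

# Guard before anything leaves the device: no known raw value may appear.
assert not any(raw in masked_context for raw in token_map.values())

masked_proposal = fake_cloud_inference(masked_context)
proposal = deanonymize(masked_proposal, token_map)
print(proposal)
```

Since the token map never leaves the device, only the local side can turn the returned proposal back into something that names real people.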

Security through agent design

Beyond data privacy, we've taken an opinionated approach to agent security by abandoning the general-purpose agent design entirely.

Instead of one monolithic agent that promises to do everything, we adopt a Unix-inspired design: thousands of small agents, each doing one thing and doing it well.

This design lets us require every agent to ship a capability manifest: a declaration of exactly which skills it can use and which personal data it can access. No agent is ever granted access beyond what it was designed to do.

capability_manifest.json — Email Draft Agent

{
  "agent_id": "email_draft_v2",
  "skills": [
    "draft_email",
    "read_contacts"
  ],
  "data_access": [
    "email_threads",
    "contact_list"
  ],
  "restricted": [
    "financial_data",
    "health_records",
    "social_media_dms",
    "browser_history"
  ]
}

This protects users from agent malfunction or compromise: if an email-drafting agent is compromised, it cannot access your financial data or social media messages. The blast radius is contained by design.
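As a rough sketch of how a runtime might enforce a manifest like the one above, the following Python snippet applies a default-deny check before an agent touches a skill or a data source. The `authorize` function and `CapabilityError` class are hypothetical names for illustration; the manifest fields are taken from the example manifest, but how Boxy actually enforces them is not described here.

```python
import json

MANIFEST_JSON = """
{
  "agent_id": "email_draft_v2",
  "skills": ["draft_email", "read_contacts"],
  "data_access": ["email_threads", "contact_list"],
  "restricted": ["financial_data", "health_records",
                 "social_media_dms", "browser_history"]
}
"""


class CapabilityError(PermissionError):
    """Raised when an agent steps outside its manifest."""


def authorize(manifest, skill, data_source):
    """Default-deny check: allow only what the manifest explicitly lists."""
    agent = manifest["agent_id"]
    if skill not in manifest["skills"]:
        raise CapabilityError(f"{agent} may not use skill {skill!r}")
    if (data_source in manifest["restricted"]
            or data_source not in manifest["data_access"]):
        raise CapabilityError(f"{agent} may not read {data_source!r}")


manifest = json.loads(MANIFEST_JSON)
authorize(manifest, "draft_email", "email_threads")  # allowed, no exception

try:
    authorize(manifest, "draft_email", "financial_data")
except CapabilityError as err:
    print(err)
```

The check denies anything not listed under `data_access`, so the `restricted` list acts as documentation and a second line of defense rather than the only barrier.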

Core principles

Zero-knowledge architecture

Our inference pipeline never sees raw data. Only anonymized, intent-bearing tokens reach the cloud.

On-device processing

NER extraction, PII detection, and embedding generation all happen on your hardware.

No surveillance capitalism

We don't sell your data. We don't train on it. We can't even see it — our architecture makes it technically impossible for us to access your raw data, even if we wanted to.

Encrypted at rest & in transit

AES-256 encryption for stored data. TLS 1.3 for all network communication.

Full audit trail

Every data access and inference request is logged in an immutable, encrypted audit trail.

Capability manifests

Every agent declares what it can access. No exceptions. No scope creep.
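One common way to make an audit trail tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below illustrates that idea for the audit-trail principle above; it is an assumption about one possible design, not a description of Boxy's implementation, and it covers only tamper evidence (encryption of the log is out of scope here). The `append_entry` and `verify` helpers are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(event, prev_hash):
    """Hash the event together with the previous entry's hash."""
    payload = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, event):
    """Append an event linked to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev": prev_hash,
             "hash": entry_hash(event, prev_hash)}
    log.append(entry)
    return entry


def verify(log):
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry_hash(entry["event"], prev_hash) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"agent": "email_draft_v2", "action": "read",
                   "target": "contact_list"})
append_entry(log, {"agent": "email_draft_v2", "action": "inference_request"})
print(verify(log))  # True

log[0]["event"]["target"] = "financial_data"  # simulate tampering
print(verify(log))  # False
```

Changing any past entry invalidates its stored hash and every link after it, so tampering is detectable as long as the latest hash is kept somewhere the attacker cannot rewrite.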
