Privacy is the product.
Not a feature.
Most companies treat privacy as a compliance checkbox. At Boxy, privacy is the core technical challenge and our primary competitive advantage.
What privacy actually means
Many AI agents advertise privacy by saying "your data is stored locally." But here's the problem: whenever inference or embedding is needed, they send everything to the cloud. Your messages, your contacts, your location — all of it leaves your device the moment the AI needs to think.
That's not privacy. That's a marketing claim with a backdoor.
We call our local processing environment the Box — your personal, on-device privacy layer where all obfuscation, NER extraction, and embedding happen. Everything that can be processed locally stays inside the Box. Nothing leaves until it's been stripped of anything that could identify you.
Local obfuscation
Before any data leaves your device, our on-device NER engine identifies and replaces personally identifiable information (PII) with anonymous tokens. Non-sensitive content passes through untouched — the cloud sees the conversation's intent but can never link it back to a real person.
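To illustrate the replacement scheme, here is a regex-only sketch that swaps two PII types for anonymous tokens and keeps the reverse mapping on-device. This is a simplified stand-in: Boxy's actual NER engine is model-based, and the patterns, token format, and `obfuscate` helper here are illustrative assumptions.

```python
import re

# Two toy entity detectors; a real NER engine covers many more
# entity types and uses a learned model rather than regexes.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d -]{7,}\d"),
}

def obfuscate(text: str) -> tuple[str, dict[str, str]]:
    """Replace detected PII with anonymous tokens; return the
    obfuscated text plus a local-only mapping for de-obfuscation."""
    mapping: dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        for i, match in enumerate(pattern.findall(text)):
            token = f"<{label}_{i}>"
            mapping[token] = match   # stays on-device, never uploaded
            text = text.replace(match, token)
    return text, mapping

safe, secrets = obfuscate("Mail anna@example.com or call +1 555 123 4567")
# `safe` carries the intent; `secrets` never leaves the device.
```

The cloud would receive only `safe`; when a response comes back, the local mapping restores the real values before anything is shown to the user.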
Local embedding
Beyond obfuscation, we also generate embeddings locally on your laptop. This means we understand the semantic meaning of your data without any raw text ever reaching a remote server. The cloud receives only anonymized vectors — mathematical representations that cannot be reverse-engineered back to your original text.
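To make the idea concrete, here is a toy stand-in for the local embedding step. A real deployment would run an on-device sentence-embedding model; this sketch hashes tokens into a fixed-size unit vector purely so the example stays self-contained, and `embed_locally` and its dimensionality are illustrative assumptions.

```python
import hashlib
import math

DIM = 64  # illustrative; real embedding models use larger dimensions

def embed_locally(text: str) -> list[float]:
    """Map text to a fixed-size numeric vector entirely on-device.
    Only this vector — never the raw text — would be uploaded."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode()).digest()
        for i in range(DIM):
            vec[i] += digest[i % len(digest)] / 255.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # normalized: no raw text inside

vector = embed_locally("draft a reply to <CONTACT_0> about dinner")
# Only `vector` (a list of floats) crosses the network boundary.
```

Note that the input here is already obfuscated text — embedding happens after PII replacement, so even the local model's input contains no real identities in transit.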
How data flows through Boxy
Every step is designed to minimize data exposure. The first four steps happen entirely on your device.
Result: Your identity never leaves the box. Zero PII in transit. Zero PII in cloud storage. Full functionality preserved.
Security through agent design
Beyond data privacy, we've taken an opinionated approach to agent security by abandoning the general-purpose agent design entirely.
Instead of one monolithic agent that promises to do everything, we adopt a Unix-inspired design: thousands of small agents, each doing one thing and doing it well.
This design lets us require a capability manifest from each agent — a declaration of exactly which skills it can use and which personal data it can access. No agent is ever granted access to do things it wasn't designed to do.
```json
{
  "agent_id": "email_draft_v2",
  "skills": [
    "draft_email",
    "read_contacts"
  ],
  "data_access": [
    "email_threads",
    "contact_list"
  ],
  "restricted": [
    "financial_data",
    "health_records",
    "social_media_dms",
    "browser_history"
  ]
}
```

This protects users from agent malfunction — if an email-drafting agent is compromised, it cannot access your financial data or social media messages. The blast radius is contained by design.
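A minimal sketch of how a runtime might enforce such a manifest. The `authorize` helper and its default-deny logic are illustrative assumptions, not Boxy's actual implementation; the field names mirror the example manifest above.

```python
import json

# The example manifest from above, parsed for enforcement.
MANIFEST = json.loads("""{
  "agent_id": "email_draft_v2",
  "skills": ["draft_email", "read_contacts"],
  "data_access": ["email_threads", "contact_list"],
  "restricted": ["financial_data", "health_records",
                 "social_media_dms", "browser_history"]
}""")

def authorize(manifest: dict, data_source: str) -> bool:
    """Grant access only to sources the manifest explicitly declares."""
    if data_source in manifest["restricted"]:
        return False  # explicitly denied
    # Default-deny: anything not declared in data_access is refused.
    return data_source in manifest["data_access"]

assert authorize(MANIFEST, "contact_list") is True
assert authorize(MANIFEST, "financial_data") is False
```

The key property is default-deny: a compromised agent cannot escalate to data sources it never declared, which is what contains the blast radius.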
Read more about our Unix Philosophy approach

Core principles
Zero-knowledge architecture
Our inference pipeline never sees raw data. Only anonymized, intent-bearing tokens reach the cloud.
On-device processing
NER extraction, PII detection, and embedding generation all happen on your hardware.
No surveillance capitalism
We don't sell your data. We don't train on it. We can't even see it — our architecture makes it technically impossible for us to access your raw data, even if we wanted to.
Encrypted at rest & in transit
AES-256 encryption for stored data. TLS 1.3 for all network communication.
Full audit trail
Every data access and inference request logged in an immutable, encrypted audit trail.
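One common way to make an audit trail tamper-evident is hash chaining, where each entry commits to the hash of the previous one. The sketch below shows that technique under stated assumptions — the `AuditTrail` class is illustrative, and encryption of the log at rest is omitted for brevity.

```python
import hashlib
import json
import time

class AuditTrail:
    """Append-only log; altering any past entry breaks the hash chain."""

    def __init__(self):
        self.entries: list[dict] = []
        self._last_hash = "0" * 64  # genesis value

    def record(self, event: str) -> dict:
        entry = {"ts": time.time(), "event": event, "prev": self._last_hash}
        # Hash the entry body (deterministic serialization) and chain it.
        self._last_hash = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        entry["hash"] = self._last_hash
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        prev = "0" * 64
        for e in self.entries:
            body = {k: e[k] for k in ("ts", "event", "prev")}
            if e["prev"] != prev:
                return False  # chain link broken
            if hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest() != e["hash"]:
                return False  # entry was modified after the fact
            prev = e["hash"]
        return True

trail = AuditTrail()
trail.record("inference_request: email_draft_v2")
trail.record("data_access: contact_list")
assert trail.verify()
```

Because each hash covers the previous hash, rewriting one record would require rewriting every record after it — any silent edit is immediately detectable.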
Capability manifests
Every agent declares what it can access. No exceptions. No scope creep.