Why Data Separation Matters — And Why Sabiki Refuses to Compromise

In today’s cloud-first, AI-driven world, enterprises entrust their security vendors with something invaluable: their data. At Sabiki, we believe that responsibility comes with a simple but uncompromising principle: your data is yours, and it stays that way.

Unfortunately, not every vendor shares this philosophy. Just recently, we’ve seen headlines in which security providers openly admit to using customer data to train their AI models. That practice may accelerate their product roadmaps, but it comes at the cost of customer trust — and ultimately, customer security.

The Critical Role of Data Separation

When we talk about data separation, we mean more than just keeping databases partitioned. True separation requires:

  • Isolation by customer — ensuring one customer’s data cannot be accessed, even accidentally, by another.

  • Isolation by region — so customers can keep their data in the jurisdiction of their choice, supporting compliance with regulations like GDPR, HIPAA, or regional data residency laws.

  • Isolation across the stack — from infrastructure and backend services to APIs and front-end experiences.

This is not just an architectural choice; it is a security mandate. Breaches and cross-contamination of data often stem from weak separation at the infrastructure or application layer. By engineering for strict separation, we drastically reduce those risks.
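
To illustrate what customer-level isolation looks like in practice, here is a minimal sketch in Python. It is purely illustrative (the TenantContext type, the store, and the field names are hypothetical, not Sabiki's actual implementation), but it shows the core idea: every read and write is scoped to a single tenant and region, so no code path can reach another customer's partition.

from dataclasses import dataclass, field

@dataclass(frozen=True)
class TenantContext:
    """Identifies the customer and the region their data must stay in."""
    tenant_id: str
    region: str

@dataclass
class TenantScopedStore:
    """Toy key-value store partitioned by (region, tenant) at the API boundary."""
    _partitions: dict = field(default_factory=dict)

    def _partition(self, ctx: TenantContext) -> dict:
        # Each (region, tenant) pair gets its own partition, created on first use.
        return self._partitions.setdefault((ctx.region, ctx.tenant_id), {})

    def put(self, ctx: TenantContext, key: str, value: str) -> None:
        self._partition(ctx)[key] = value

    def get(self, ctx: TenantContext, key: str) -> str:
        # A lookup can only resolve inside the caller's own partition.
        return self._partition(ctx)[key]

store = TenantScopedStore()
acme = TenantContext(tenant_id="acme", region="eu-west-1")
globex = TenantContext(tenant_id="globex", region="us-east-1")

store.put(acme, "policy", "quarantine-strict")
print(store.get(acme, "policy"))       # "quarantine-strict"
try:
    store.get(globex, "policy")        # globex never stored this key...
except KeyError:
    print("cross-tenant lookup fails") # ...so the lookup cannot cross partitions

The same principle extends well beyond a toy store: the tenant and region boundary is enforced at every layer, from storage through services to the API, rather than relying on application code to remember a filter.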

Sabiki’s Commitment: Security First, Convenience Second

We’ve recently completed a major redesign of our backend to ensure full-stack data separation. Every customer’s environment can now be deployed not only in the cloud region of their choice, but also with dedicated backend resources fully segregated from other customers.

This means whether you’re a global enterprise with strict data residency requirements, or an organization demanding private cloud deployment for the highest level of assurance, Sabiki can adapt — without compromise.
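
As a rough sketch of what "deploy in the region of your choice, with dedicated resources" can mean in configuration terms, the hypothetical descriptor below captures those two choices explicitly. The field names and values are illustrative assumptions, not Sabiki's real deployment schema.

from dataclasses import dataclass
from enum import Enum

class IsolationTier(Enum):
    DEDICATED = "dedicated"      # dedicated backend resources, fully segregated
    PRIVATE_CLOUD = "private"    # deployed into the customer's own cloud

@dataclass(frozen=True)
class DeploymentSpec:
    """Per-customer deployment: where the data lives and how it is segregated."""
    customer_id: str
    region: str                  # e.g. "eu-west-1" to satisfy EU data residency
    isolation: IsolationTier

# A global enterprise with EU residency requirements and dedicated resources.
spec = DeploymentSpec(customer_id="acme", region="eu-west-1",
                      isolation=IsolationTier.DEDICATED)
print(spec)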

The AI Dilemma: Why We Don’t Train on Your Data

There’s no denying that customer data is the lifeblood of modern AI systems. Training on vast, real-world datasets makes AI models smarter. But when the dataset is your sensitive emails, files, and business communications, the cost of sharing it is simply too high.

For us, the math is simple:

  • Short-term gain for us is not worth long-term risk for you.

  • We will not trade customer trust for model accuracy.

  • Our AI models are designed to work effectively without ever mining your private data.

Yes, this makes our job harder. Yes, it slows down some aspects of model development. But security means putting customer interests first, always.

Learning From Others’ Mistakes

When providers admit to using customer data to train their AI systems, it should raise alarms. Even if anonymization or aggregation is claimed, history shows that “anonymized” datasets can often be deanonymized. Beyond the technical risks, it undermines the very trust customers place in a security partner.

At Sabiki, we refuse to take shortcuts with your data. We believe in building AI responsibly, through methods that respect your privacy, ensure compliance, and protect the confidentiality of your business.

Using Customer Data for AI Training Breaches Core Security Principles

When security vendors decide to use customer data to train their AI models, they are not just making a product choice — they are undermining foundational principles enshrined in global security frameworks. At a high level, such practices conflict with:

  • ISO/IEC 27001 (Information Security Management Systems) — which requires organizations to implement controls for confidentiality and restrict use of customer data to only what is explicitly agreed upon. Training AI on customer data without clear, informed consent goes against these obligations.

  • SOC 2 (Trust Services Criteria) — particularly the Confidentiality and Privacy principles, which emphasize that customer data should only be used for its intended purpose, with strict access and handling controls. Repurposing that data for internal AI training, without explicit customer agreement, runs counter to these criteria.

  • GDPR (General Data Protection Regulation) and similar data protection laws (such as CCPA in California) — which establish strict limits on secondary use of personal data. Using customer communications to train models would typically exceed the scope of lawful processing.

  • NIST Cybersecurity Framework (CSF) — which highlights the importance of data governance, protection, and accountability. Leveraging customer data for internal model training undermines accountability and data stewardship.

  • Cloud Security Alliance (CSA) Cloud Controls Matrix — which reinforces the requirement for data ownership, segregation, and purpose limitation. Vendors are expected to keep customer data isolated and to only use it in ways that customers have explicitly authorized.

In other words: if a vendor is feeding customer data into their AI pipeline, they are running counter to the very standards most enterprises rely on to measure cloud and SaaS vendor security.

Trust is the True Product

In cybersecurity, tools and features matter — but trust is the real product. Our redesigned backend, region-specific deployments, and commitment to never using your data for AI training are all part of one mission: to earn and safeguard that trust.

At Sabiki, we don’t just secure your email. We secure your confidence.

Developed by email security professionals and data scientists with decades of experience to make life easier for customers and MSPs alike, Sabiki Email Security is a cloud-native, 'built for Microsoft 365' SaaS solution that protects your organization from phishing, spam, and targeted scams using the power of a dynamic AI feedback loop engine. Combining dynamic machine learning with next-generation contextual and behavioral analysis, Sabiki Email Security provides an incredible level of granularity in engine customization, with seamless onboarding and operation.