Our Promise on AI Model Training and Data Usage

Model Evaluation

At Proofpoint, we are continuously improving our products by evaluating and deploying cutting-edge AI models, including large language models and other advanced machine learning techniques. Before any model is put into service—whether for our products or internal operations, and whether sourced externally or developed in-house—it undergoes a rigorous, multi-stage evaluation process, including:

  • Technical Assessment – Ensuring performance, reliability, and operational efficiency.
  • Security Review – Evaluating risks and implementing mitigation strategies.
  • Legal & Regulatory Compliance – Aligning with all relevant privacy laws, industry regulations, and company policies.
  • Business Alignment – Ensuring the model serves strategic objectives and enhances cybersecurity effectiveness.

Models that fail to meet these stringent criteria are not adopted. Additionally, Proofpoint maintains a firm commitment to data privacy, confidentiality, security, and sovereignty. All deployed AI models operate within robust privacy and security frameworks, ensuring data protection through rigorous access controls and encryption methods.

Customer data is never sold to third parties at any stage of evaluation, deployment, or operation.

AI security is an ongoing effort, and our models are continuously monitored to adapt to evolving threats while maintaining the highest standards of privacy, fairness, and accountability. Proofpoint remains dedicated to protecting your organization with AI-driven cybersecurity solutions that prioritize security without compromising privacy.

Data Used for Training

Our AI systems are trained to detect and mitigate emerging threats with speed and precision. To do this effectively, they must learn from real-world examples of malicious content.

When a message is definitively identified as malicious, whether through user submission, threat research, or detection system analysis, it may be used to improve our detection capabilities. This includes training Nexus AI to recognize malicious patterns and adapt to evolving attacker behavior. The accuracy of our detections, and the safety of our customers, depend on this feedback loop.

Use of Threat Analytics, Aggregated Intelligence, and Small Random Samples

In addition to malicious message content, our models benefit significantly from aggregated threat analytics across our ecosystem. This includes metadata like sender infrastructure, delivery patterns, known-bad URLs or domains, campaign telemetry, and behavioral indicators.

This data is normalized and aggregated, providing crucial context that allows us to adapt defenses in near real time. It's how we stay ahead of advanced threats, and it's foundational to the value our customers receive.

Clarifying LLM Use and Data Isolation

We do not use customer data to train any generative LLMs, such as those based on GPT or similar technologies.

Instead, the portions of Nexus AI that are trained on malicious samples leverage:

  • Classical machine learning models (e.g., classifiers, clustering algorithms), and
  • Non-generative natural language models for tasks like pattern recognition and entity detection.

These models do not generate text, and they do not retain or reproduce identifiable customer content. All model training is governed by strict data governance policies and is fully aligned with GDPR and other regional privacy standards.

Commitment to Privacy and Compliance

We take customer data protection seriously. All training processes adhere to GDPR, CCPA, the EU’s Data Act, and other regional regulations, and we operate under a clear data governance model.

© 2025 Proofpoint. All rights reserved. The content on this site is intended for informational purposes only.
Last updated May 30, 2025.
