Seeing the Full Picture: How to Measure Email Security Effectiveness the Right Way

June 16, 2026 Ryan Kalember

Perhaps it’s an obvious question, but it’s worth both asking and answering: why would anyone believe a vendor-generated report that compares the vendor’s own solution with its competitors? Our answer is simple: you shouldn’t take it on faith, but the data is still valuable. For critical cybersecurity challenges like email threat protection, comparative data is easy to obtain, typically requiring little time and read-only access to an API. For years, Proofpoint has recommended that every organization run a Proof of Value (POV) exercise before choosing an architecture and a product, and we continue to do so. We’re providing this data as an aggregation of what we’ve seen across hundreds of these analyses, because it offers a useful guide to which data points to compare and how to compare them.

The 2026 Verizon DBIR showed that the share of breaches involving the human element rose to 62% this year. Email is still the number one way attackers reach people, and the effectiveness of your email security determines how much of that risk ever touches your users. As organizations standardize on cloud-native platforms like Microsoft 365, one question matters more than any benchmark chart: of every malicious message that targeted my people, how many were stopped before they reached the inbox? That is a different question from “of the messages that were delivered, what got cleaned up afterward?”, and the difference is the whole point.

At Proofpoint, we believe transparency is foundational, and we’ve operationalized it for years. We welcome the growing industry interest in publishing detection data; more measurement is good for customers and good for the ecosystem. But as more vendors share numbers, it’s worth being precise about what is actually being measured, where in the mail flow it is measured, and who deserves credit for the catch. Numbers without that context can mislead.

27.1% of threats blocked by Proofpoint bypassed Microsoft (median, side-by-side)

408 advanced threats missed per 1,000 users, annualized, behind Microsoft

41.7 min median time for Microsoft to detect threats Proofpoint had already blocked

Median results across 696 production customer assessments. Source: Proofpoint aggregated proof-of-concept data, July 2025 – April 2026.

Where You Measure Changes What You See

To interpret any efficacy number, you first have to understand deployment architecture. In many Proofpoint deployments, our Secure Email Gateway (SEG) sits in front of Microsoft 365 or Google Workspace in the SMTP mail flow. The organization’s MX record points at the gateway, which inspects, filters, and blocks malicious content before Microsoft Defender for Office 365 or Google ever sees it. Anything stopped upstream never reaches Microsoft or Google, which means those threats are completely invisible to any measurement taken inside the email provider’s own stack.
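As a rough illustration of why the MX record matters, the sketch below classifies where inbound mail lands first, given a domain's MX records. The function name, the return labels, and the suffix list are assumptions made for this example; the suffixes are illustrative and not an exhaustive list of provider hostnames. The one firm rule it relies on is that senders try the MX host with the lowest preference value first.

```python
# Hostname suffixes that suggest mail is delivered straight to the mailbox
# provider. Illustrative assumption only, not an exhaustive or official list.
NATIVE_SUFFIXES = (
    ".mail.protection.outlook.com",
    ".google.com",
    ".googlemail.com",
)


def first_hop(mx_records):
    """Classify the preferred MX target of a domain.

    mx_records: list of (preference, hostname) tuples, for example the
    result of a DNS MX lookup done elsewhere. The lowest preference value
    is the host that sending servers try first.
    """
    if not mx_records:
        raise ValueError("no MX records supplied")
    _, host = min(mx_records)            # lowest preference wins
    host = host.rstrip(".").lower()      # normalize trailing dot and case
    if host.endswith(NATIVE_SUFFIXES):
        return "mailbox_provider"
    return "gateway_or_other"
```

If the preferred host is a gateway, anything that gateway blocks never appears in the mailbox provider's logs, which is the blind spot described above.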

This creates a structural blind spot in any benchmark that only counts what was left over after a gateway already did its job. Evaluating only the messages that survived upstream filtering is like reviewing only the shots that reached the goalkeeper without counting the ones the defenders already blocked. The most dangerous threats are often the ones intercepted earliest, and a leftover-only view never sees them. A fair comparison has to start from the full population of inbound threats, not the residue.

It’s also why deployment mode matters when reading vendor benchmarks. An API or post-delivery integration that sits behind Microsoft, Google, or another provider can only ever inspect mail that was already delivered; by design it sees a fraction of the threat volume a front-line gateway does, whether it’s Proofpoint’s API solution or any other. Comparing a behind-the-mailbox integration against a full inline gateway and concluding the added value is “nominal” conflates the deployment with the capability. The right one-to-one comparison puts each engine in front of the same inbound stream.

How Proofpoint Measures Efficacy — Both Ways

For more than two years, Proofpoint has delivered a transparent, rigorous methodology through our Efficacy Analysis. Rather than estimate, we correlate unique threat IDs between Proofpoint and the incumbent solution using real-world production data. That lets us answer two questions precisely: what did Proofpoint catch that bypassed the incumbent, and what did the incumbent catch that bypassed Proofpoint? This is not speculation or lab traffic; it is a true one-to-one comparison on identical threats in the wild.
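The idea of correlating threat IDs can be sketched in a few lines. This is a minimal illustration, not Proofpoint's implementation: the `Verdict` record, its field names, and the assumption that both engines share a common threat identifier are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    threat_id: str  # hypothetical identifier shared by both engines
    blocked: bool   # True if this engine stopped the threat


def correlate(primary, incumbent):
    """Compare two engines on the threats that both of them saw.

    primary, incumbent: lists of Verdict records, one per threat.
    Threats seen by only one engine are ignored, because a one-to-one
    comparison needs the same threat on both sides.
    """
    p = {v.threat_id: v.blocked for v in primary}
    i = {v.threat_id: v.blocked for v in incumbent}
    shared = p.keys() & i.keys()
    return {
        "primary_only": {t for t in shared if p[t] and not i[t]},
        "incumbent_only": {t for t in shared if i[t] and not p[t]},
        "both": {t for t in shared if p[t] and i[t]},
        "neither": {t for t in shared if not p[t] and not i[t]},
    }
```

The `primary_only` and `incumbent_only` sets correspond to the two questions above; reporting both directions is what keeps the comparison from being one-sided.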

Across our most recent aggregated assessment data, spanning 696 production deployments (July 2025 to April 2026), the median customer running Proofpoint in front of Microsoft Defender catches an additional 27.1% of threats. Behind Microsoft, customers see a median of 408 advanced threats delivered per 1,000 users on an annualized basis. Microsoft’s zero-hour auto purge (ZAP) would not have remediated these threats: the median dwell time in Microsoft was 41.7 minutes before Proofpoint condemned them. Of the advanced threats reaching the inbox behind Microsoft, 70.5% were phishing, 22.3% malware, 5.4% business email compromise (BEC), and 1.8% TOAD (telephone-oriented attack delivery).
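For readers who want to reproduce figures of this kind from their own exports, here is a minimal sketch of a median dwell time and a category breakdown. The input shapes are assumptions for illustration: dwell is taken as the gap between a delivery timestamp and a detection timestamp, and each threat carries a single category label. Both functions expect non-empty input.

```python
from collections import Counter
from datetime import datetime
from statistics import median


def median_dwell_minutes(records: list[tuple[datetime, datetime]]) -> float:
    """Median minutes between delivery and detection.

    records: (delivered_at, detected_at) pairs, one per threat.
    """
    dwell = [
        (detected - delivered).total_seconds() / 60
        for delivered, detected in records
    ]
    return median(dwell)


def category_shares(categories: list[str]) -> dict[str, float]:
    """Each category's share of all threats, as a percentage to 1 decimal."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}
```

A median is used rather than a mean so that a handful of threats detected days later do not dominate the result.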

This isn’t unique to one incumbent. Measured consistently in terms of advanced threats missed per 1,000 users, behind each vendor, over the trailing 365 days, the broader SEG market shows comparable exposure. We publish the full picture rather than a single favorable slice:

*Advanced threats detected by Proofpoint divided by users protected, per 1,000 users; trailing 365 days. Source: Proofpoint aggregated POV data.*
Secure Email Gateways	Advanced threats missed / 1,000 users
Google	505
Barracuda	494
Mimecast	455
Microsoft	408
Trend Micro	320
Cisco	238
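The per-1,000-users figure in the table is a simple normalization. The sketch below follows the caption's definition (threats divided by users protected, per 1,000 users, over 365 days); scaling a shorter observation window up to a year is an assumption added here for illustration, and the function and parameter names are hypothetical.

```python
def missed_per_1000_users(threats: int, users: int, days_observed: int = 365) -> float:
    """Advanced threats per 1,000 users, scaled to a 365-day period.

    threats: advanced threats detected behind the incumbent in the window
    users: number of users protected
    days_observed: length of the measurement window in days
    """
    if users <= 0 or days_observed <= 0:
        raise ValueError("users and days_observed must be positive")
    annualized = threats * 365 / days_observed  # assumption: linear scaling
    return 1000 * annualized / users
```

For example, 2,040 threats across 5,000 users over a full year works out to 408 per 1,000 users. Normalizing by user count is what makes a 500-seat tenant comparable with a 50,000-seat one.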

We also measure ourselves where it’s least flattering: false positives and false negatives. Across our customer base Proofpoint sustains a real-world detection rate of 99.999%, with fewer than one false positive in every 30 million messages and roughly one false negative in every 5 million, far exceeding our published SLA. Those rates live in customer-facing dashboards week over week, not in a marketing SLA. Transparency means showing the misses too.

Post-Delivery Catch Is a Backstop — Not a Scoreboard

Catching a threat after it has already landed in a user’s inbox is valuable, and Proofpoint performs automated post-delivery remediation of retroactively convicted threats, just as other platforms do. But two cautions apply when post-delivery numbers are used to claim superiority.

First, post-delivery catch is a measure of what got through, not what was prevented. A high post-delivery remediation share means more malicious mail reached people in the first place and dwelled there (in our data, a median of nearly 42 minutes), long enough for a user to engage. Pre-delivery blocking is strictly better for the user, and it is the only way to contain threats aimed at users’ AI assistants, such as Copilot and Gemini. The goal is to make post-delivery cleanup rare, not to celebrate its volume.

Second, credit should follow the engine that made the catch. When a benchmark counts only its own post-delivery detections while excluding or under-counting the remediation performed by an integrated partner, the resulting share is an artifact of the accounting, not the protection. A proper analysis attributes each catch to whoever made it, on both sides.
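Both cautions can be made concrete with a small tally. This is a sketch under stated assumptions: each caught threat is recorded once as an (engine, stage) pair, the stage labels are hypothetical, and the engine names are whatever your exports use.

```python
STAGES = ("pre_delivery", "post_delivery")


def catch_summary(events):
    """Attribute every catch to the engine that made it, split by stage.

    events: iterable of (engine, stage) pairs, one per caught threat.
    """
    summary = {}
    for engine, stage in events:
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage!r}")
        counts = summary.setdefault(engine, dict.fromkeys(STAGES, 0))
        counts[stage] += 1
    return summary


def post_delivery_share(summary):
    """Fraction of all catches, across all engines, made after delivery."""
    total = sum(sum(counts.values()) for counts in summary.values())
    post = sum(counts["post_delivery"] for counts in summary.values())
    return post / total if total else 0.0
```

Because every engine's catches go into the same summary, the post-delivery share reflects how much mail actually reached inboxes rather than how one vendor chose to count.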

Architectural Blind Spots: Threats That Are Never Inspected

Some threats don’t merely slip past native filters. They never traverse the detection stack at all, so they leave no detection footprint and appear in no benchmark. Three pathways account for most of this hidden risk:

1. Direct Send abuse

Microsoft 365’s Direct Send feature, originally intended for printers and legacy apps, allows unauthenticated messages to be delivered to internal users. Attackers exploit it to spoof internal senders and bypass authentication so that malicious mail appears trustworthy. We’ve observed campaigns during evaluations where these messages were delivered without being scored or logged, even after being flagged for authentication failure.

2. Direct delivery (tenant-to-tenant bypass)

Tenant-to-tenant delivery lets one Microsoft 365 tenant send directly to another without traversing the recipient’s email security stack. Many legitimate SaaS services rely on this path, but attackers can use it to deliver phishing or malware with no inspection by any inline defense. We believe this is solvable: all messages, including tenant-to-tenant, should be routed through the recipient’s security stack.

3. TDS evasion and the resurgence of malware

Threat actors increasingly deliver malware through trusted services: compromised websites, abused cloud storage, and time-delayed, environment-aware payloads engineered to defeat sandboxing and intelligence lookups. In late 2024, we observed malware overtake phishing as the top advanced-threat category delivered through Microsoft 365, driven heavily by these evasion techniques that exploit the very architecture meant to stop them. This trend has continued, but the sheer volume of phish kits, many of them GenAI-created, has put credential attacks back on top in 2026.

None of these threats show up in a benchmark that only counts what a detection engine scored. Yet they are precisely the threats a layered, inline defense is designed to stop.

How to Run a Fair, Side-by-Side Evaluation

If measurement is going to drive purchasing decisions, the evaluation itself has to be fair. Whatever vendors you compare, insist on these principles. They protect you regardless of who comes out ahead:

Overlapping timeframes. Measure every solution over the exact same window, so threat volumes and campaign activity are identical across platforms. Different time ranges produce different threat populations and an apples-to-oranges result.
Exportable raw data. Vendors count differently: some per message, others per URL or attachment. Demand exportable datasets including sender, recipient, subject, time of delivery, time of detection, and message ID, so you can reconcile the counts yourself. Be wary of platforms that gate raw data behind a purchase or impose export limits that prevent real analysis.
Clarity between automated and manual actions. An efficacy claim should reflect what the system caught, not what a team cleaned up by hand behind the scenes. Ask each vendor to distinguish automated detections from manual condemnations by analysts or sales engineers, and to confirm POV results aren’t curated before you see them.
Complete visibility from day one. You should see missed, blocked, and remediated messages, not just summary totals, and have full console access from the moment the evaluation begins, with no gating, time delays, or conditional access.
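The first two principles, overlapping timeframes and exportable raw data, can be applied together when reconciling two exports. The sketch below assumes each export row is a dict with hypothetical keys `message_id` and `detected_at`; real exports will use different column names and need mapping first.

```python
from datetime import datetime


def unique_messages(rows, start: datetime, end: datetime):
    """Collapse an export to the set of message IDs detected in [start, end).

    A vendor that counts per URL or attachment may emit several rows for
    one message; reducing to a set of message IDs removes that inflation.
    """
    return {r["message_id"] for r in rows if start <= r["detected_at"] < end}


def reconcile(vendor_a_rows, vendor_b_rows, start: datetime, end: datetime):
    """Compare two exports per message, over the exact same window."""
    a = unique_messages(vendor_a_rows, start, end)
    b = unique_messages(vendor_b_rows, start, end)
    return {"only_a": a - b, "only_b": b - a, "both": a & b}
```

Counting per message over one shared window puts both vendors on the same basis before any percentages are computed.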

This is exactly the model behind Proofpoint’s Efficacy Analysis: it correlates unique threats across Proofpoint and the incumbent in real time, using live production data, with the underlying threat records available for inspection. Over eight quarters that approach has consistently shown the same result — roughly a quarter of the threats Proofpoint blocks bypassed Microsoft.

Proofpoint + Microsoft: More Secure Together

None of this means that single-vendor or layered is the only answer for everyone. It means customers deserve the full picture before they decide. Microsoft Defender for Office 365 offers real value, and the best-protected Microsoft 365 environments combine native controls with third-party protection that closes architectural gaps and delivers measurable results.

Proofpoint Core Email Protection is available as both a Secure Email Gateway and an API-based deployment, so organizations can match the model to their cloud strategy. Deployed as a SEG in front of Microsoft 365, Proofpoint delivers:

Pre-delivery blocking of 99.999% of threats, reducing spam and unwanted mail and improving user productivity
Full visibility into missed, blocked, and remediated messages, with real-time and historical false-positive / false-negative metrics
Advanced detection for BEC, credential phishing, ransomware, malware, and TOAD, powered by Nexus AI, behavioral analysis, and URL and attachment sandboxing
Automated post-delivery remediation for threats convicted after delivery
Integration with best-of-breed tools like CrowdStrike Falcon, SentinelOne, and Microsoft Defender for Endpoint for enhanced visibility and actionable insight

With Proofpoint in front of the tenant, threats are stopped before they reach Microsoft; Defender then adds inline and post-delivery controls for additional coverage. The two are complementary layers — and the customer gets to see, in their own data, exactly what each layer contributes.

The Bottom Line: Don’t Settle for Partial Visibility

Measuring only what arrives after upstream filtering, crediting post-delivery cleanup as if it were prevention, and omitting the threats that never traversed the stack at all combine to produce a number that may look reassuring but tells an incomplete story. Real protection is measured against the full population of threats targeting your people, both pre- and post-delivery, with the underlying data open to inspection.

That’s the standard we hold ourselves to, and the standard we encourage every customer to demand of every vendor, including us.

Ready to See the Full Picture in Your Environment?

Connect with your Proofpoint team to request an Email Security Efficacy Analysis tailored to your organization. It’s a fast, data-driven, fully transparent way to understand exactly what threats are targeting your users and how many are being stopped before they ever reach their target.
