Artificial Intelligence for Today’s Financial Services Industry Archive Compliance

In 2019, an estimated 293.2 billion business emails will be sent each day.[1]

In 2018-2019, nearly one-third of Chief Marketing Officers’ budgets will be allocated to marketing technology,[2] multiplying the diversity of data stored in enterprise archive systems.


“We have 250,000 documents to review in less than a week!”

This is an all-too-familiar scenario for financial services organizations, as well as many other firms, when responding to e-discovery requests from regulators or opposing counsel. The high volume of stored messages and the various data sources from which they are generated create major issues in responding to information requests related to potential compliance violations. These can include producing messages that do not pertain to the matter at hand and spending more money than necessary to cull data in an e-discovery solution.

Proofpoint’s Enterprise Archive addresses these problems by natively integrating analytics and artificial intelligence, in the form of Technology Assisted Review (TAR), with archived data. TAR is the process of having computer software electronically classify documents as responsive to a matter based on input from expert reviewers.[3] The use of predictive coding, in which artificial intelligence tags data as relevant or nonrelevant, is settled law[4] and a best practice for financial institutions storing high volumes of data.

So what exactly can analytics and TAR do for banks, insurance companies, and other financial services organizations managing an ever-growing archive? E-discovery analytics can highlight potential violations in a dataset’s language usage and communication patterns. If, for example, the SEC, FINRA, or another regulator requests information about communications surrounding certain financial transactions, analytics and TAR allow e-discovery reviewers to efficiently reduce the set of returned items and tag messages as either responsive or non-responsive to the matter. TAR then analyzes these decisions and uses predictive coding to cull the remaining dataset (sometimes millions of messages) before it is exported to outside counsel or an e-discovery tool. Today, TAR’s tagging of messages for export is essential to efficiently producing tens (or hundreds) of thousands of messages.
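To make the tag-then-predict workflow concrete, here is a minimal sketch of predictive coding in Python. It is an illustration only, not Proofpoint's implementation: it assumes a simple naive Bayes word model, and the function names (train, responsive_score, cull) and the sample messages are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately simple: lowercase and split on whitespace.
    return text.lower().split()


def train(labeled):
    """Build word counts from (text, is_responsive) pairs tagged by reviewers.

    Assumes the seed set contains at least one responsive and one
    non-responsive message.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in labeled:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def responsive_score(model, text):
    """Return naive Bayes log-odds; a positive score leans responsive."""
    word_counts, doc_counts, vocab = model
    score = math.log(doc_counts[True] / doc_counts[False])
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in the seed set carry no evidence
        # Laplace (add-one) smoothing avoids taking log(0).
        p_resp = (word_counts[True][word] + 1) / (totals[True] + len(vocab))
        p_non = (word_counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_resp) - math.log(p_non)
    return score


def cull(model, messages, threshold=0.0):
    """Keep only the messages predicted to be responsive."""
    return [m for m in messages if responsive_score(model, m) > threshold]


# Hypothetical seed set tagged by expert reviewers.
seed = [
    ("wire transfer to offshore account approved before earnings", True),
    ("move the trade before the earnings announcement", True),
    ("lunch menu for friday team outing", False),
    ("reminder to submit your parking pass form", False),
]
model = train(seed)

# Untagged messages still in the archive result set.
remaining = [
    "please approve the transfer before earnings",
    "friday parking reminder",
]
kept = cull(model, remaining)
print(kept)
```

In practice the seed set would be far larger, and reviewers would spot-check the predictions and retrain over several rounds before anything is exported.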

Financial services firms archive millions of external and internal messages each day, including those by brokers, investment advisors, traders, loan officers, support reps, customers, and many others. How much money do these companies waste each year by manually reviewing all messages that are responsive to search terms? Or by delivering hundreds of thousands of nonrelevant messages to outside counsel or third-party vendors for review, at an average of around $300 per hour for a junior attorney?[5] Experts estimate that once all costs are considered, including storage, reviewing, culling, redacting, and production, e-discovery could cost firms up to $30,000 per gigabyte of electronically stored information.[6] That's a lot of money, even for the biggest financial services firms.
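A rough back-of-the-envelope calculation shows how quickly manual review costs add up. The sketch below uses the 250,000-document scenario and the $300 hourly rate cited above; the review speed (50 documents per hour) and the share of messages culled before export (70 percent) are hypothetical assumptions chosen only for illustration, not measured results.

```python
def review_cost(documents, docs_per_hour, hourly_rate):
    """Cost of having an attorney manually review a set of documents."""
    hours = documents / docs_per_hour
    return hours * hourly_rate


DOCUMENTS = 250_000   # scenario from this article
HOURLY_RATE = 300     # junior attorney rate cited in this article, USD
DOCS_PER_HOUR = 50    # hypothetical review speed (assumption)
CULL_PERCENT = 70     # hypothetical share removed before export (assumption)

full_cost = review_cost(DOCUMENTS, DOCS_PER_HOUR, HOURLY_RATE)

# Integer arithmetic keeps the remaining document count exact.
remaining_docs = DOCUMENTS * (100 - CULL_PERCENT) // 100
culled_cost = review_cost(remaining_docs, DOCS_PER_HOUR, HOURLY_RATE)

print(f"Review everything:  ${full_cost:,.0f}")
print(f"Review after cull:  ${culled_cost:,.0f}")
print(f"Difference:         ${full_cost - culled_cost:,.0f}")
```

Under these assumptions, reviewing everything costs $1,500,000 and reviewing the culled set costs $450,000. Actual figures depend heavily on review speed, billing rates, and how much of the result set is truly nonrelevant.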

Historically, enterprise archives were designed to retrieve and export all custodian and/or lexicon-based data for review. This made sense when financial institutions archived a fraction of today’s electronically stored information, and the vast majority of that data came from a small number of platforms, such as Exchange email and Bloomberg messages. But today, soaring message volumes and the growing diversity of social platforms that financial institutions use to market products and services are causing archive vendors to scramble to integrate Early Case Assessment (ECA) functionality that culls result sets before the data is exported.

Proofpoint’s Enterprise Archive is ahead of the pack with Technology Assisted Review and e-discovery analytics. It can cull hundreds of thousands of messages in a fraction of the time humans would need to process them, before the data is exported for review. While other archive vendors are in reactive mode trying to keep up with today’s requirements for artificial intelligence, Proofpoint’s financial services customers already have access to this cost-saving feature.

At the same time, we are developing additional functionality to meet the future archive and e-discovery requirements of the financial services industry.

Banks, broker-dealers, financial advisors, insurance companies, and other financial institutions face strict compliance requirements for providing complete and proper data when responding to information requests from the SEC, FINRA, and national and state regulatory authorities. With the volume and variety of message data types growing quickly, it is critical for companies to be able to swiftly identify the right information to retrieve from storage. This requires an archive system with integrated artificial intelligence functionality that can quickly and accurately tag appropriate messages for export while removing others that are not relevant. Proofpoint's Early Case Assessment functionality, Technology Assisted Review, provides this critical service while reducing the time and costs associated with your e-discovery efforts.


[1] The Radicati Group, Inc. Market Researcher. Email Statistics Report, 2017–2021.

[2] Chris Pemberton. November 5, 2018. Gartner, Inc. 8 Top Findings in Gartner CMO Spend Survey 2018-19.

[4] Rio Tinto PLC v. Vale S.A., No. 14-Civ-3042 (S.D.N.Y. Mar 2, 2015), “the case law has developed to the point that it is now black letter law that where the producing party wants to utilize TAR for document review, courts will permit it.”

[5] United States Attorney’s Office. USAO Attorney’s Fees Matrix – 2015–2019.

[6] David Degnan, Accounting for the Costs of Electronic Discovery, 12 Minn. J.L. Sci. & Tech. 151 (2011). Herbert L. Roitblat, Search & Information Retrieval Science, 8 SEDONA CONF. J. 192, 192 (Fall 2007).