A fundamental problem engineering leaders must contend with is the movement of data, especially large volumes of it. The need to move data is ubiquitous in an organization, whether that means copying data from a transactional database (online transaction processing, or OLTP) to an analytical database (online analytical processing, or OLAP) or supplying several microservices with the same data from one database.
Solutions for the data movement problem can range from a script that runs nightly to pull all data changes from the OLTP database to the OLAP database (for a very simple system), to a very complex data pipeline with a host of systems (for a large enterprise software-as-a-service company).
More than a decade and a half ago, a simple script was enough to solve the data movement problem. That approach is no longer viable for the volumes of data processed today, and the array of architectural solutions and products now available in this space is mind-boggling.
In this blog post, we’re focusing on one architectural solution—event-driven architecture (EDA)—that we believe solves the data movement problem on a massive scale and handles many common challenges, too. Even though EDA might seem complex given the number of products you may need to use, the architecture itself is very simple at its core.
3 major components of event-driven architecture
When you need to move data from one place to another, say from Service A to Service B, how do you capture the initial data or changes in the data? In EDA, you use an event.
What exactly is an event? Here’s a Webster dictionary definition: “A thing that happens, that is of importance.” In the context of EDA, an event refers to something that caused data to change.
For example, say one of the fields in a user’s record changed—perhaps the user’s last name has been updated. The update is an event. And instead of only capturing the actual change to the data, we capture the event itself and store it somewhere.
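To make the last-name example concrete, here is a minimal sketch of what such an event could look like in Python. The class name `UserLastNameChanged`, its fields and the sample values are all hypothetical, chosen only for illustration; the point is that the record describes what happened (old value, new value, when) rather than only the resulting state.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class UserLastNameChanged:
    """Hypothetical event: records what happened, not just the new value."""
    user_id: str
    old_last_name: str
    new_last_name: str
    occurred_at: str  # ISO 8601 timestamp supplied by the caller


def to_json(event: UserLastNameChanged) -> str:
    """Serialize the event, tagging it with its type so readers can tell events apart."""
    payload = {"type": type(event).__name__, **asdict(event)}
    return json.dumps(payload, sort_keys=True)


# Illustrative values only.
event = UserLastNameChanged(
    user_id="u-123",
    old_last_name="Smith",
    new_last_name="Jones",
    occurred_at="2023-01-15T10:30:00+00:00",
)
serialized = to_json(event)
```

The serialized string is what would be stored or sent onward; the event is frozen because something that has already happened should not be edited afterward.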
The event is one of three key parts of event-driven architecture:
- The producer: the service or system that initiates the change and produces the event.
- The consumer: the service that needs to know about the change and consumes the event.
- The event itself.
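The three parts can be sketched in a few lines of Python. This is a toy, single-process illustration that uses the standard library's `queue.Queue` as a stand-in for a real message broker; the function names `produce` and `consume_all` and the event fields are assumptions made for the example.

```python
import queue

# The channel between producer and consumer. A real system would use a
# message broker here; an in-memory queue is only a stand-in.
channel = queue.Queue()


def produce(user_id: str, new_last_name: str) -> None:
    """Producer: the service that made the change publishes an event about it."""
    channel.put(
        {"type": "UserLastNameChanged", "user_id": user_id, "new_last_name": new_last_name}
    )


def consume_all() -> list:
    """Consumer: drains whatever events are waiting and returns them."""
    received = []
    while not channel.empty():
        received.append(channel.get())
    return received


produce("u-123", "Jones")      # the producer emits one event
events_seen = consume_all()    # the consumer picks it up later
```

Note that the producer never calls the consumer directly. It only knows about the channel, which is what lets the two sides evolve independently.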
“Event storming” leads to better understanding among teams
EDA greatly simplifies the way we think about capturing data changes. What makes it easier and better is that the system producing the event can publish it right at the source, in the application code, at the moment the event occurs. From a technical standpoint, the event can be published just before or just after the change is written to the database (which you may still need or want to do).
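As a rough sketch of publishing at the source, the function below writes a change to the database and then publishes the corresponding event from the same piece of application code. It assumes an in-memory SQLite database and uses a plain list as a stand-in for a broker; the `users` table, `publish` and `update_last_name` are hypothetical names. It also does not handle the case where the publish fails after the commit, which a production system would need to consider.

```python
import sqlite3

published_events = []  # stand-in for a message broker (assumption)


def publish(event: dict) -> None:
    """Hypothetical publish call; a real one would send to a broker."""
    published_events.append(event)


def update_last_name(conn: sqlite3.Connection, user_id: str, new_last_name: str) -> None:
    """Write the change to the database, then publish the event from the same code path."""
    row = conn.execute("SELECT last_name FROM users WHERE id = ?", (user_id,)).fetchone()
    old_last_name = row[0]
    conn.execute("UPDATE users SET last_name = ? WHERE id = ?", (new_last_name, user_id))
    conn.commit()
    # Publish right after the write, while we still know exactly what changed.
    publish(
        {
            "type": "UserLastNameChanged",
            "user_id": user_id,
            "old_last_name": old_last_name,
            "new_last_name": new_last_name,
        }
    )


# Set up a throwaway database with one illustrative user.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, last_name TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("u-123", "Smith"))
conn.commit()

update_last_name(conn, "u-123", "Jones")
```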
A big advantage of this way of working is that when requirements are being defined, the product managers can define the events along with their technical counterparts. There is no handoff from the product team to the technical team, which ends up defining the database schema. Both teams can work together to define the events. This process is called “event storming.”
Since we started our event-driven architecture journey at Proofpoint, we’ve gone through a few event storming sessions. Initially, these sessions were a bit bumpy, but more recent sessions have been a big success. Everyone is speaking the same language, and there is no information loss when going from product management to engineering. Product managers can see the information exactly as they want to capture and store it.
This benefit alone, in my opinion, is reason enough to use EDA. How many times have engineers seen requirements from the product team misinterpreted when the corresponding database schema was built for a feature?
Asynchronous communication makes EDA shine
Another major advantage of EDA is that it’s asynchronous. What does that mean and why does it matter? An email vs. phone analogy can help to illustrate.
With the phone, you can only call one person at a time. You can probably conference in more people, but it’s all in real time, and the other person needs to be available at the exact moment to take the call and receive the information.
So, it’s synchronous, and a scheduled script could be an example of synchronous communication. (You can leave a voicemail, of course, but please humor me and stick with the analogy for now.) This is exactly the problem with synchronous calls: both services, the source from which data is being pulled and the destination to which it is being written, need to be online and available at the same time. If the receiving service isn’t available, the data can’t be sent.
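The failure mode of the "phone call" style can be shown with a small, hypothetical sketch. The `Destination` class and `sync_copy` function below are invented for illustration: the copy writes directly to the destination, so when the destination is offline nothing is transferred.

```python
class DestinationOffline(Exception):
    """Raised when the receiving service cannot take the call."""


class Destination:
    """Toy stand-in for the service or database we are writing to."""

    def __init__(self) -> None:
        self.online = True
        self.rows = []

    def write(self, row: dict) -> None:
        if not self.online:
            raise DestinationOffline("destination is not available")
        self.rows.append(row)


def sync_copy(source_rows: list, destination: Destination) -> int:
    """Synchronous transfer: every row needs the destination to be up right now."""
    copied = 0
    for row in source_rows:
        destination.write(row)
        copied += 1
    return copied


destination = Destination()
destination.online = False  # the receiver is down when the script runs

try:
    sync_copy([{"user_id": "u-123", "last_name": "Jones"}], destination)
    transfer_succeeded = True
except DestinationOffline:
    transfer_succeeded = False
```

Here the sender has no way to hand off the data and move on, so the whole transfer depends on the receiver being up at that moment.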
Now, let’s talk about email. The top benefits of email over a phone call are that you can send information whenever you want, and the receiver (or consumer) can get the information and consume it whenever they want. Also, you can send information to as many people as you want in a single message, even to an entire company.
Imagine a CEO having to call every employee to impart an important message instead of just sending an email to them all at once. The amount of time the CEO spends writing and sending the message is a fraction of the phone call time, especially as the company gets bigger and bigger. Another advantage here is that employees can simply ignore the message if the information isn’t relevant to them.
Asynchronous communication is why EDA shines. A producer publishes an event, and any number of consumers can listen for it. (Some upper limits may exist as to the number of consumers, depending on the type of system and use cases.)
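The "one message, many recipients" idea can be sketched with a tiny event bus. This is an in-process illustration only: the `EventBus` class and the consumer names are hypothetical, and handlers here run immediately inside `publish`, whereas a real broker would deliver events to consumers independently of the producer.

```python
class EventBus:
    """Toy publish/subscribe hub: one published event reaches every subscriber."""

    def __init__(self) -> None:
        self._subscribers = {}  # event type -> list of handler callables

    def subscribe(self, event_type: str, handler) -> None:
        self._subscribers.setdefault(event_type, []).append(handler)

    def publish(self, event_type: str, payload: dict) -> int:
        """Deliver the payload to every subscriber; return how many received it."""
        handlers = self._subscribers.get(event_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


bus = EventBus()
billing_inbox = []  # hypothetical consumer 1
search_inbox = []   # hypothetical consumer 2
bus.subscribe("UserLastNameChanged", billing_inbox.append)
bus.subscribe("UserLastNameChanged", search_inbox.append)

# The producer publishes once; both consumers receive the event.
delivered = bus.publish("UserLastNameChanged", {"user_id": "u-123", "new_last_name": "Jones"})

# Nobody subscribed to this type, so it is simply ignored.
ignored = bus.publish("UserDeleted", {"user_id": "u-999"})
```

Adding a third consumer is one more `subscribe` call, and the producer's code does not change, just as adding a recipient to an email does not change how the message is written.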
Also, consumers can choose to pull the events when they need them or set up a “push notification” that alerts the consumer when a relevant event happens. Going back to our email analogy, we can set up our email system so that we only see new email messages when we log in. We can pull the new messages, or we can set up notifications, so we’re alerted as soon as a new email arrives in our inbox.
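The difference between pulling and being notified can be sketched with a small append-only log. The `EventLog` class and its methods are assumptions made for this example: a pull consumer remembers an offset and asks for anything newer when it is ready, while a push consumer registers a callback and is told about each event as it arrives.

```python
class EventLog:
    """Toy append-only event log supporting both pull and push delivery."""

    def __init__(self) -> None:
        self._events = []
        self._listeners = []

    def append(self, event: dict) -> None:
        """Store the event, then notify every push-style listener."""
        self._events.append(event)
        for notify in self._listeners:
            notify(event)

    def read_from(self, offset: int):
        """Pull: return events at or after `offset`, plus the next offset to use."""
        batch = self._events[offset:]
        return batch, offset + len(batch)

    def on_event(self, listener) -> None:
        """Push: register a callback that is invoked for each new event."""
        self._listeners.append(listener)


log = EventLog()

pushed = []                  # push consumer: alerted as each event arrives
log.on_event(pushed.append)

log.append({"id": 1})
log.append({"id": 2})

# Pull consumer: "logs in" and fetches everything since its last offset.
first_batch, offset = log.read_from(0)
log.append({"id": 3})
second_batch, offset = log.read_from(offset)
```

The pull consumer only sees event 3 when it asks again, like checking email after logging in; the push consumer has already been handed all three.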
A fundamental change—and a forward-looking investment
Our EDA journey at Proofpoint has been long but exciting. It was also an essential move so that we could scale, grow and meet our customers’ expectations for near-real-time data.
EDA requires many different teams to work together. Also, buy-in from leadership is needed to make such fundamental architecture changes. We were extremely lucky to have both, as our leadership understood the long-term impact and benefits of building such a solid foundation for our system.
Join the team
At Proofpoint, our people – and the diversity of their lived experiences and backgrounds – are the driving force behind our success. We have a passion for protecting people, data, and brands from today's advanced threats and compliance risks. We hire the best people in the business to:
- Build and enhance our proven security platform
- Blend innovation and speed in a constantly evolving cloud architecture
- Analyze new threats and offer deep insight through data-driven intel
- Collaborate with customers to help solve their toughest security challenges
If you're interested in learning more about career opportunities at Proofpoint, visit here: https://www.proofpoint.com/au/company/careers
About the author:
Vaish Krishnamurthy is a Senior Engineering Director at Proofpoint. In 2015, she founded her start-up CleanRobotics, where she and a few engineers created an automated smart trashcan that separates recyclables from landfill waste. She was named by Pittsburgh Business Times as a “Fast Tracker” (40 Under 40) in 2015. Vaish holds a Master’s degree in Electrical Engineering from the University of California, Riverside. She is a strong advocate for increasing the number of women in Computer Science.