Nyc clouds

Using Multimodel Endpoints for Scalable User Behaviour Profiling and Anomaly Detection with SageMaker

Share with your network!

Engineering Insights is an ongoing blog series that gives a behind-the-scenes look into the technical challenges, lessons and advances that help our customers protect people and defend data every day. Our engineers write each post, explaining the process that led up to a Proofpoint innovation. 

Organisations face constant threats from insider activities. This makes it crucial to identify when user behaviours deviate from normal patterns. Another critical use case is preventing sensitive data from leaving an organisation undetected.

At Proofpoint, we have designed a sophisticated anomaly detection pipeline. Not only does it model individual user behaviour profiles, but it also aggregates tenant-specific insights. This enhances its accuracy in detecting and preventing security threats.

There are unique challenges when it comes to detecting anomalous behaviour in a multi-tenant system. We must preserve data isolation while also ensuring scalability. The model must also handle data sparsity as well as perform complex feature engineering to ensure that the most statistically significant user activities are considered.

We have successfully used multimodel endpoints (MME) to operate efficiently at scale. Here’s how.

User behaviour profiling

To detect anomalous activities, we first compute comprehensive user behaviour profiles (UBP). Each UBP captures distinct behavioural patterns for individual users across the whole tenant. We do this for every tenant completely separately from each other, adhering to strict data compliance policies.

This profile encapsulates key statistical metrics such as modified standard deviation, percentile values and other derived metrics. In this way, we effectively characterise a user's activity distribution over a period of several days.

We determined that user activities along the direction of least variance are often the most critical indicators of anomalous behaviour. By prioritising low-variance features, it’s possible to identify behavioural signals that are more meaningful, thereby improving the model's robustness to noise.

The traditional modelling approach recommends dropping low variance features as these can be less informative and prone to noise. However, for modelling activity behaviour, low variance features can provide a normal behaviour pattern that can be helpful when it comes to identifying peer group clusters. And slight variations must not be ignored as noise. For example, a VPN proxy for office users from a certain location may be low variance. But a change in access source IP may potentially indicate an anomaly.

Data isolation

Data security and privacy were key considerations given that it’s a multi-tenant system. We ensured strict data segregation to prevent any information leaking between tenants. Each user's profile is constructed from data belonging to that specific user’s tenant. Further, each tenant has its own separate machine learning (ML) model. This design enforces robust data isolation.

Tenant profiles

We also computed tenant profiles that characterise overarching behavioural patterns across the customer’s environment. Tenant profiles identify key metrics, such as website usage patterns, activity trends and other tenant-wide attributes. We then combined user profiles with broader tenant patterns. And then we enhanced the system’s ability to distinguish legitimate deviations from false positives.

Inference

Our inference pipeline evaluates a user’s current activity within a defined time window. Both the user behaviour profile and the corresponding tenant profile are used to compute a dynamic anomaly score for the user's current activity pattern.

Activities that deviate significantly are flagged as anomalies. Anomalous events are then flagged and added to the data model along with explainability which is then finally rendered in the UI as shown in Figure 1 below.

Flagged anomalous activities on dashboard

Figure 1: Flagged anomalous activities.

To scale the system efficiently we used multimodel endpoints. Our inference pipeline is shown in Figure 2.

Pipeline architecture

Figure 2: Pipeline architecture.

MMEs allow multiple models to be hosted on a single endpoint. This enables dynamic, on-demand model loading and unloading. It’s an approach that significantly reduces infrastructure costs because it enables resource use to be optimised.

It also maintained low latency for real-time inference roundtrip based on our tests. Every machine learning (ML) model is stored in blob storage and dynamically loaded when required. This enables fast and efficient model switching. MMEs allowed us to manage several models under a single endpoint, streamlining our deployment strategy. This allowed us to scale seamlessly as the number of tenants increased.

Future work

In the future, we plan to use LLMs and AI agentic workflows to augment our anomaly detection pipeline. And large language models (LLMs) could summarise activities within specified time windows, automatically extracting key insights and anomalies. This would give security analysts more actionable intelligence. It would also enable faster and more accurate decision-making. What’s more, AI agents can work from the initial anomaly findings to execute workflows, like compiling real-time file content scanning results and data lineage information to remove false positives.

Join the team

At Proofpoint, our people—and the diversity of their lived experiences and backgrounds—are the driving force behind our success. We have a passion for protecting people, data and brands from today’s advanced threats and compliance risks.  

We hire the best people in the business to:  

  • Build and enhance our proven security platform  
  • Blend innovation and speed in a constantly evolving cloud architecture  
  • Analyse new threats and offer deep insight through data-driven intelligence  
  • Collaborate with our customers to help solve their toughest cybersecurity challenges  

If you’re interested in learning more about career opportunities at Proofpoint, visit the careers page.

About the authors

Ram Kulathumani

Ram Kulathumani is a senior manager at Proofpoint, leading ML research and development for impactful security use cases. Recently, his primary focus has been fine-tuning large transformer models, prompt engineering, evaluating LLMs and optimising ML inference performance.

Azeem Yousaf

Azeem Yousaf is a senior engineer at Proofpoint specialising in developing AI-driven threat intelligence solutions. His expertise spans core algorithm development and advanced feature engineering. Outside work, Azeem enjoys playing cricket at his local club.

Khurram Ghafoor

Khurram Ghafoor is a senior director at Proofpoint, specialising in data platform architecture and AI/ML pipelines. With over 20 years of experience, he brings expertise in software development, scalable architecture and the application of machine learning and AI technologies.