What Is a Data Leak?

When sensitive data is disclosed to an unauthorised third party, it’s considered a “data leak” or “data disclosure”. The terms “data leak” and “data breach” are often used interchangeably, but a data leak does not require exploitation of a vulnerability. A data leak can simply be the disclosure of data to a third party as a result of poor security policies or storage misconfigurations.

Differences Between a Data Leak and a Data Breach

It might seem insignificant, but it’s important to understand the difference between a data leak and a data breach. Both can be costly and have critical consequences, but a data leak typically involves far more negligence than a data breach. Human error is a significant risk for organisations, and a data leak is often the result of insider threats, which are usually unintentional but can be just as damaging as a data breach.

Data breaches are caused by unforeseen risks or unknown vulnerabilities in software, hardware or security infrastructure. An attacker must find the vulnerability and exploit it, which is why administrators must continually update outdated software and install security patches or updates immediately.

A data leak can lead to a data breach, but it does not require exploiting an unknown vulnerability. Typically, human error is behind a data leak. A common example of a data leak is a misconfigured Amazon Web Services (AWS) S3 bucket. S3 buckets are cloud storage spaces used to upload files and data. They can be configured for public access or locked down so that only authorised users can access data. It’s common for administrators to misconfigure access, thereby disclosing data to any third party. Misconfigured S3 buckets are so common that there are sites that scan for them and post the results for anyone to review.
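To make the S3 example concrete, here is a minimal sketch of how an administrator might flag bucket policy statements that grant access to any principal. It works on an already-parsed policy document and assumes the standard AWS policy JSON layout (`Statement`, `Effect`, `Principal`). The bucket name, account ID, role, and `Sid` values are hypothetical, and the check deliberately ignores `Condition` blocks, ACLs, and account-level public access settings, so treat it as a starting point rather than a complete audit.

```python
import json


def public_statements(policy: dict) -> list[dict]:
    """Return the Allow statements in a policy that apply to any principal."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]

    flagged = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if principal == "*":
            flagged.append(stmt)
        elif isinstance(principal, dict):
            aws = principal.get("AWS")
            values = aws if isinstance(aws, list) else [aws]
            if "*" in values:
                flagged.append(stmt)
    return flagged


# Hypothetical policy for an illustrative bucket; all names are made up.
example_policy = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    },
    {
      "Sid": "TeamOnly",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::111122223333:role/analytics"},
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
""")

for stmt in public_statements(example_policy):
    print("Open to any principal:", stmt["Sid"], stmt["Action"])
```

Only the first statement is reported, because the second one names a specific role instead of a wildcard principal.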

What Causes Data Leaks?

A misconfigured AWS S3 bucket is just one example of an underlying issue that causes data leaks, but data can be exposed through many other misconfigurations and human errors. The line between data breaches and data leaks is blurry, but generally, a data leak is caused by:

  • Unpatched infrastructure: As developers become aware of vulnerabilities, they release security patches. If those patches are not installed, software could disclose data to unauthorised users.
  • Weak security policies: Data can be disclosed unknowingly when security policies do not block unauthorised users.
  • Misconfigured firewall: Firewalls are supposed to block traffic from reaching internal resources. However, a misconfiguration can open ports and applications unknowingly and disclose data.
  • Open-source files: Some developers include hard-coded credentials and access keys in public repositories that can be used by a third party to access data.
  • Vendor issues: Vendors with access to secure data could be a vector for an attacker to access confidential data.
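The hard-coded credentials case in the list above lends itself to automated checks before code is pushed to a public repository. The sketch below is a deliberately small illustration: the two rules and the `scan_text` helper are assumptions made for this example, not the rule set of any particular scanning product. The sample key is the placeholder value used in AWS documentation, not a real credential.

```python
import re

# Illustrative rules only; real secret scanners ship far larger rule sets.
RULES = {
    # Long-term AWS access key IDs start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # A quoted literal assigned to a name containing password/passwd/secret.
    "hardcoded_password": re.compile(
        r"(?:password|passwd|secret)\w*\s*[:=]\s*['\"][^'\"]{4,}['\"]",
        re.IGNORECASE,
    ),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a rule."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings


# Hypothetical configuration file contents.
sample = '''\
region = "eu-west-1"
aws_key = "AKIAIOSFODNN7EXAMPLE"
db_password = "hunter2-not-real"
'''

for line_number, rule_name in scan_text(sample):
    print(f"line {line_number}: possible {rule_name}")
```

A check like this could run as a pre-commit step, so that a finding blocks the commit instead of being discovered after the repository is public.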

How Do Data Leaks Happen?

Although the list below isn’t exhaustive, it covers mistakes commonly associated with data leaks. Human error by employees or vendors is often behind a data leak, but it’s not the only reason for unwanted disclosures.

Factors that cause a data leak include:

  • Infrastructure misconfigurations: Most data leak incidents involve misconfigurations internally or in the cloud. Misconfigured security applications can also enable data disclosure. Enterprises, government agencies, and small businesses all make this mistake.
  • Employee errors: Cybersecurity training is critical for employees, especially those with access to sensitive data. Malicious disclosure or unintentional human errors from employees mishandling data are common in data leaks.
  • System errors: Unexpected system errors can cause a system to default to open access for unauthorised users. The exposed data can even be indexed by search engines, making it easy to discover for anyone searching for publicly disclosed sensitive information.
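One way to limit the "system errors" case above is to make access checks fail closed, so that an unexpected error denies access instead of granting it. The following is a minimal sketch under that assumption. The `is_authorised` wrapper, both lookup functions, and the user and file names are hypothetical and stand in for whatever permission backend an organisation actually uses.

```python
from typing import Callable


def is_authorised(user: str, resource: str,
                  lookup: Callable[[str, str], bool]) -> bool:
    """Return True only when the lookup succeeds and explicitly grants access."""
    try:
        return lookup(user, resource) is True
    except Exception:
        # Fail closed: an unexpected error must never widen access.
        return False


def working_lookup(user: str, resource: str) -> bool:
    # Hypothetical in-memory grant table.
    grants = {("alice", "payroll.csv")}
    return (user, resource) in grants


def broken_lookup(user: str, resource: str) -> bool:
    # Hypothetical permission backend that is currently unavailable.
    raise TimeoutError("permission service unreachable")


print(is_authorised("alice", "payroll.csv", working_lookup))  # True
print(is_authorised("bob", "payroll.csv", working_lookup))    # False
print(is_authorised("alice", "payroll.csv", broken_lookup))   # False
```

The trade-off is availability: legitimate users are locked out while the backend is down, which is usually preferable to disclosing sensitive data.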

Types of Data at Risk

Organisations don’t want any data disclosed to an unauthorised user, but some data is more sensitive than others. It might not mean much for a product table to be disclosed to the public, but a table full of users’ social security numbers and identification documents would be a grave predicament that could permanently damage the organisation’s reputation.

Examples of data that could be disclosed after a leak include:

  • Trade secrets or intellectual property stored in files or databases.
  • Private proprietary source code.
  • Current product and inventory status, including vendor pricing.
  • Proprietary research used for product improvements, patents, and inventions.
  • Sensitive customer data, including health and financial information.
  • Employee data, including social security numbers, financial information and credentials.

How to Prevent Data Leaks

Data protection strategies should always include employee education and training, but administrators can take additional steps to stop data leaks.

Here are a few ways you can prevent a data leak incident:

  • Audit and classify data: It’s common for fast-growing businesses to lose track of data and its storage locations. Without knowing where data is located and which applications and users move it, it’s impossible to cover all your bases. Classifying data also reveals employee permission misuse and potential data leaks from unnecessary access.
  • Be proactive: Risk assessment and risk management help identify risks and give administrators mitigation strategies. That may require additional security measures, policies, and employee training.
  • Protect data based on value and sensitivity: A leak of unimportant data is not ideal, but it’s far less damaging than the disclosure of sensitive data. After an audit and discovery of data, focus efforts on the most valuable data first. Data discovery software can help with this because it provides dependable, automated content analysis and tracks information across your network.
  • Offer cybersecurity training: Education has been shown to reduce the chance of human error from phishing or social engineering. It also helps employees know how to properly manage data and data protection.
  • Monitor continuously: Deploying the right monitoring tools helps administrators identify anomalies faster and makes them more proactive in containing and eradicating a threat. Some tools also identify misconfigurations and potential data leak issues.
  • Have a disaster recovery plan: Disaster recovery with backups will restore destroyed data. A recovery plan names the people involved in data recovery and sets out the steps for communicating with affected customers and any news outlets.
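The audit, classification, and prioritisation steps above can be illustrated with a small content scan. This is a sketch only. It assumes two hypothetical sensitivity labels, "restricted" and "internal", and looks for US social security number patterns and for payment card numbers that pass the Luhn checksum. Commercial data discovery tools cover many more data types and file formats.

```python
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number: str) -> bool:
    """Check the digits in a string with the Luhn checksum used by payment cards."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return bool(digits) and total % 10 == 0


def classify(text: str) -> str:
    """Return 'restricted' for SSN-like or card-like content, else 'internal'."""
    if SSN_PATTERN.search(text):
        return "restricted"
    for match in CARD_PATTERN.finditer(text):
        if luhn_valid(match.group()):
            return "restricted"
    return "internal"


# Hypothetical file contents, for illustration only.
documents = {
    "products.txt": "Product list: widgets, gadgets",
    "hr_export.txt": "Employee SSN 123-45-6789",
    "billing.txt": "Card 4111 1111 1111 1111 on file",
}

for name, body in documents.items():
    print(name, classify(body))
```

Files labelled "restricted" would be the first candidates for tighter permissions, encryption, and monitoring.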

Example Data Leak Scenarios

To better design security infrastructure around sensitive data, it helps to know common scenarios where data leaks occur. You may not even identify some scenarios until they happen to your organisation. Here are a few ways an organisation could fall victim to a data leak:

  • Employee brings files home from work: There’s a reason why larger corporations lock down USB drive access. Employees might think it’s harmless to take their work home and store data on their devices, but it can lead to a data leak if the device is lost or stored insecurely.
  • Unencrypted data storage: Users and attackers could obtain unencrypted data either from a permission error or accidental transfer to publicly accessible cloud storage. Data sent in instant messages or emails is also vulnerable if unencrypted.
  • Password misuse: Employees who write down their passwords or store them insecurely could disclose them accidentally to a third party. Strong passwords are key to preventing breaches and data loss, which is why it’s so important to educate your people on password awareness and best practices.
  • Outdated software: Developers release patches for software with known vulnerabilities, but administrators must take the initiative to install those patches. Security patches should be installed immediately, or attackers could take advantage of vulnerable systems that store data.
  • Software misconfigurations: When software is not configured properly to store files or data, it could openly disclose data without administrators being aware of the issue.
  • Development server compromise: Development environments are often loosely protected, yet production data is frequently replicated to the development server for access by developers. That might seem harmless, but developers could unintentionally configure the server or the environment in a way that discloses data.
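For the development server scenario above, one common mitigation is to mask sensitive fields before production data is copied into a development environment. The sketch below replaces selected fields with keyed hashes using Python’s standard `hmac` and `hashlib` modules. The field names, record, and key are hypothetical, and truncating the digest to 16 characters is only a readability choice for this example.

```python
import hashlib
import hmac

# Hypothetical field names; adjust to the real schema.
SENSITIVE_FIELDS = {"ssn", "email"}


def pseudonymise(record: dict, key: bytes) -> dict:
    """Return a copy of the record with sensitive fields replaced by keyed hashes."""
    masked = {}
    for field, value in record.items():
        if field in SENSITIVE_FIELDS and value is not None:
            digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256)
            masked[field] = digest.hexdigest()[:16]
        else:
            masked[field] = value
    return masked


# Hypothetical production record and masking key.
production_row = {
    "id": 1,
    "ssn": "123-45-6789",
    "email": "jane@example.com",
    "plan": "basic",
}
masking_key = b"keep-this-key-out-of-the-dev-environment"

print(pseudonymise(production_row, masking_key))
```

Because the same input and key always produce the same masked value, developers can still join tables on the masked fields. The key itself should stay in production so that the original values cannot be recomputed or checked from the development side.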

Examples of Real-World Data Leaks

General scenarios help with data governance and risk management, but even large corporations fall victim to threats. Here are a few examples of large organisations or government entities that fell victim to data leak risks:

  • The Veterans Administration lost 26.5 million records with sensitive data, including social security numbers and dates of birth, after an employee took data home.
  • Idaho Power Company in Boise, Idaho, fell victim to a data leak after used hard drives containing sensitive files and confidential information were sold on eBay.
  • Loyola University computers containing sensitive student information were disposed of without the hard drives being wiped. The result was the disclosure of social security numbers and financial aid records.
  • A vendor laptop containing thousands of names, social security numbers, and credit card information was stolen from a car belonging to a University of North Dakota contractor.
  • A software error at a Texas university allowed users with access to the software to also view names, courses, and grades for 12,000 students.

How Proofpoint Can Help

Identifying misconfigurations and gaps in data loss prevention (DLP) requires staff who know how to monitor and scan for these issues. Many organisations don’t have the personnel to properly plan for disasters and build infrastructure to secure data from unintentional data leaks. Proofpoint can take you from start to finish, from designing a data loss prevention plan to implementing it. We have information protection experts to help you classify data, automate data procedures, stay compliant with regulatory requirements, and build infrastructure that supports effective data governance.