Definition

Data archiving is the process of collecting older data and moving it to a secure location so that it can be retrieved if needed in a data forensics investigation.

Archives are distinct from backups. With archives, the data is moved to free up storage resources. With backups, working data is copied so that it can be restored in the event of a system failure or disaster.

Many compliance and regulatory standards require data archives, but they can also be useful during disaster recovery and forensic investigations.

Who Should Use Data Archiving?

Any organization that aggregates and stores a significant amount of data should archive it. Some regulatory standards (such as PCI-DSS, HIPAA, and SOX) impose frequency and retention requirements. Beyond those requirements, it is up to each organization to determine when data should be archived, how long it is kept before it can be overwritten or destroyed, and where it should be stored. Archived data can be used for disaster recovery, but it’s mainly useful during forensic investigations after a cyber-incident.

Organizations that don’t have a lot of data can use simple backups instead of archives. But most businesses accumulate large amounts of data as they grow. Unused old data can needlessly take up terabytes of storage.

Archiving data frees up storage space for newer data, which is useful for organizations that have limited storage space. By archiving files, email messages, and database records, organizations can free up space without risking violations of regulatory standards or losing valuable information that must be reviewed in the future.

How Data Archiving Works

Outside of archiving to meet regulatory standards, the process starts with administrators determining which files and data are no longer in use and can be moved. The storage used for archives can be cheaper and slower, but it must be secure and available when archives need to be reviewed. By moving data to lower-cost storage, the organization can save money and allocate faster storage to more critical data. This process can also improve productivity by reducing the time it takes for employees to open files and access data.
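As a rough illustration of the first step, the sketch below selects files that have not been modified for a given number of days and moves them to an archive location. It is a minimal example under stated assumptions, not how any particular archiving product works: the function names, the use of modification time as a stand-in for "no longer in use," and the directory layout are all hypothetical, and the archive directory is assumed to sit outside the directory being scanned.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def find_stale_files(root: Path, max_age_days: int, now: Optional[float] = None) -> list[Path]:
    """Return files under root last modified more than max_age_days ago.

    Assumption: modification time is used as a proxy for "no longer in use".
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.stat().st_mtime < cutoff
    )


def archive_files(files: list[Path], root: Path, archive_root: Path) -> list[Path]:
    """Move each file into archive_root, keeping its path relative to root."""
    moved = []
    for src in files:
        dest = archive_root / src.relative_to(root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest))  # the file leaves primary storage
        moved.append(dest)
    return moved
```

A real policy would usually combine age with other signals, such as file type or owner, before anything is moved.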

Because the archived data is no longer in use, most administrators store it in read-only mode so that it cannot be altered. Keeping archives read-only preserves their integrity should they be needed in an investigation after a data breach or impropriety. It also stops attackers from changing data to hide their tracks after a compromise.
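One simple way to picture this is to record a checksum when a file is archived, clear its write permissions, and compare the checksum again later. The sketch below is only an illustration: the function names are hypothetical, and file permission bits are an assumption standing in for whatever write protection the storage system actually provides. Permission bits alone will not stop an attacker with administrative access.

```python
import hashlib
import os
import stat
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def seal(path: Path) -> str:
    """Record the file's checksum, then clear all write permission bits."""
    checksum = sha256_of(path)
    mode = path.stat().st_mode
    os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
    return checksum  # store this separately from the archive itself


def is_unaltered(path: Path, expected_checksum: str) -> bool:
    """Return True if the file still matches the checksum taken at seal time."""
    return sha256_of(path) == expected_checksum
```

Keeping the recorded checksums somewhere other than the archive means a change to the archived file can still be detected even if the attacker can write to the archive.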

Securing data archives is just as important as keeping them unaltered. Attackers know that archives hold a wealth of information about an organization’s intellectual property, internal messages, and financial data. These archives are a target for attackers who gain access to high-privilege network accounts or exploit vulnerabilities that give them access to archive data.

The medium used to store archives is up to the organization, and the decision usually hinges on convenience, reliability, and availability. Organizations have traditionally used magnetic tape because it can store much more data than other media, but tape devices tend to be slower. Even so, tape is still standard for some organizations that need a low-cost way to store large amounts of data in a small space.

Network-attached drives are also common, but this medium is much more expensive. Network storage requires the real estate to host it and expensive hardware to secure and maintain it. But unlike most tape systems, network drives keep archive data readily available should the organization or investigators ever need to access it.

A third common option is cloud storage. Cloud storage has the advantages of availability and low costs, but the speed is dependent on the organization’s bandwidth and network speed. Many organizations have moved to cloud storage for its convenience and savings, but it’s still the responsibility of the organization to keep the data secure.

The process of archiving data is often automated using software. The features and capabilities offered by archiving software depend on the vendor, but most products share a core set of features. An administrator configures the time, location, and data that must be archived, and the software does the rest. An archiving policy must be created to define the rules for moving data. Using archive policies, an administrator ensures that data moved to the storage location follows the right regulatory standards and requirements.
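To make the idea of an archiving policy concrete, the sketch below models a policy as an ordered list of rules, each naming which data it applies to, how old the data must be, and where it should go. This is a hypothetical structure for illustration only; the rule fields, file patterns, and destination names are assumptions and do not reflect any specific vendor's configuration format.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class ArchiveRule:
    pattern: str       # which data: a filename pattern such as "*.log"
    min_age_days: int  # when: minimum age before the data is archived
    destination: str   # where: hypothetical name of the archive location


def match_rule(filename: str, age_days: int, rules: list[ArchiveRule]) -> Optional[ArchiveRule]:
    """Return the first rule that applies to the file, or None to leave it in place."""
    for rule in rules:
        if fnmatchcase(filename, rule.pattern) and age_days >= rule.min_age_days:
            return rule
    return None


# Example policy with made-up values.
POLICY = [
    ArchiveRule(pattern="*.log", min_age_days=90, destination="cold/logs"),
    ArchiveRule(pattern="*.eml", min_age_days=365, destination="cold/mail"),
]
```

With this policy, a 120-day-old `app.log` would match the first rule, while a 10-day-old one would match nothing and stay where it is.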

In conjunction with other rules about archiving, a retention policy is also necessary. A retention policy determines the amount of time an archive stays available before the data can be overwritten or destroyed. Typically, a retention policy for backups is about 30 days, but archived data might be retained longer before it’s destroyed. Some organizations keep archived data for years before media is rotated or archives are deleted. For the most sensitive data, archives may never be overwritten or destroyed. Archiving and compliance standards could have a retention policy requirement, so organizations should ensure that this configuration does not violate any regulatory standards.
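The retention check itself is simple date arithmetic, as the sketch below suggests. The 30-day backup figure comes from the text above; the class and function names are hypothetical, and any other retention period an organization plugs in should come from its own compliance requirements rather than from this example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class RetentionPolicy:
    name: str
    retain_days: Optional[int]  # None means the data is never destroyed


def eligible_for_destruction(archived_on: date, policy: RetentionPolicy, today: date) -> bool:
    """Return True once the retention period has fully elapsed."""
    if policy.retain_days is None:
        return False  # most sensitive data: keep indefinitely
    return today >= archived_on + timedelta(days=policy.retain_days)


# A typical backup policy as described in the text.
BACKUP = RetentionPolicy(name="backup", retain_days=30)
# Hypothetical policy for archives that may never be overwritten.
PERMANENT = RetentionPolicy(name="permanent", retain_days=None)
```

Passing `today` in explicitly, rather than reading the clock inside the function, makes the rule easy to test and audit.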

Benefits of Having a Data Archive Policy

The two main benefits of archiving data are the cost savings and the ability to free up faster storage devices. Archived data can be stored on cheaper storage devices, provided those devices are reliable and are protected from failure. To ensure archives do not suffer from device failure, they could be a part of backup procedures or stored in the cloud where the provider ensures the reliability of the hardware.

Freeing up faster storage devices also saves the organization money. Instead of buying more storage devices, the organization can archive data and free up current space for newer data.

Administrators must know what data is being archived so that the process does not interfere with user productivity. Files that were created years ago might still be used every day, so they should not be archived. Email messages might fall under this same scenario as well. Email messages users save in their inbox should not be archived, but messages that users no longer need can be moved to an email archiving system instead of remaining on the email server.

What is the Difference Between Archives and Backups?

People often confuse data archives with backups, and the two terms are often—but incorrectly—used interchangeably. While both are important, archives and backups are used for different purposes. Here are a few key differences.

Whereas backups store a copy of data, archiving moves it to a new location to free up space.

Backups are critical for compliance, disaster recovery and business continuity. Archives, on the other hand, are necessary for compliance. Some organizations may use archived data and backups together—backing up an archive helps ensure its integrity.

If compliance requires archives, an organization should make sure that retention policies laid out by standards are followed to avoid fines. And both backups and archives should be adequately secured.
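The copy-versus-move distinction in the first difference above can be shown in a few lines. This is a minimal sketch using Python's standard library; the function names and directory arguments are hypothetical, and real backup or archiving tools do considerably more (scheduling, verification, encryption).

```python
import shutil
from pathlib import Path


def back_up(src: Path, backup_dir: Path) -> Path:
    """Backup: copy the file, so the working copy stays where it is."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, backup_dir / src.name))


def archive(src: Path, archive_dir: Path) -> Path:
    """Archive: move the file, so the original location is freed."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(src), str(archive_dir / src.name)))
```

After `back_up` the file exists in two places; after `archive` it exists only in the archive location.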

Ineffective cybersecurity defenses could leave all archive data accessible to attackers. A data breach involving an archive could be devastating to business integrity and brand reputation. Whether it’s backups or archives, ensure that these files are tightly secured against attackers and their exploits.