Data storage compliance and corporate governance regulations are having a tremendous impact on the storage organization, as well as the management practices employed to retain, search, certify and destroy data. It's not just the major regulations, like the Sarbanes-Oxley Act (SOX) or the Health Insurance Portability and Accountability Act (HIPAA), that influence storage -- there are well over 10,000 regulations that affect data storage, backup and protection across a range of industries. But in spite of the many regulations that now govern records storage, there are no mandates or guidelines that dictate implementation. Companies are often left alone in their quest to identify the regulations that relate to them, identify what data should be saved and implement storage to meet those regulations. This article covers the essential goals of data storage compliance, examines implementation considerations and obstacles and reviews the impact of compliance on storage.
The goals of data storage compliance
The actual terms of each regulation vary dramatically, but storage compliance regulations typically focus on three distinct areas of interest: retention, integrity, and security.
Retention dictates how long data must be kept in storage, but stored data must also be retrievable quickly in the face of compliance audits or legal discovery. Search is a serious issue with retention -- an organization needs advanced tools to locate relevant data stored for 10 years, 20 years or longer. Data must also be readable over time, which can be a crippling problem as operating systems, email server versions or other elements of the storage infrastructure evolve. For example, email records saved today may not be readable by operating systems and applications 20 years from now -- even if the media is completely intact.
Integrity, also called "immutability," means ensuring that data has not been changed or lost because of corruption or media failure. Tape was the traditional immutable medium for many years, though optical write once, read many (WORM) media like CD and DVD are cheaper and far more reliable. Disk-based write-once platforms, like content addressed storage (CAS), meet the demands for rapid accessibility.
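The general idea behind content addressing can be sketched in a few lines of Python. This is a toy in-memory illustration, not any vendor's implementation: the class name `WriteOnceStore`, its methods and the choice of SHA-256 are all assumptions made for the example. The point it shows is that when an object's address is derived from its content, any later change to the stored bytes becomes detectable on read.

```python
import hashlib


class WriteOnceStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # The address is computed from the content itself.
        address = hashlib.sha256(data).hexdigest()
        # Writing identical content again is a no-op; nothing is overwritten.
        self._objects.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Re-hash on read: corruption shows up as an address mismatch.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored object no longer matches its address")
        return data


store = WriteOnceStore()
record_address = store.put(b"example email record")
retrieved = store.get(record_address)
```

Because there is no update operation, a "changed" record is simply a new object with a new address, and the original remains retrievable under its old one.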
Security protects sensitive data from unauthorized access. Security is typically part of the storage platform (e.g., user authentication in a CAS platform), though encryption is taking on a more prominent role for tapes and file servers. Regulators often require companies to have policies and procedures in place to manage integrity and security.
Implementing data storage compliance
All storage compliance involves hardware and software elements, but there is no single approach or architecture to rely on. For example, disk-to-disk, disk-to-disk-to-tape and disk-to-disk-to-optical storage platforms all have potential applications in compliance, but regulations rarely, if ever, define storage implementations. Organizations are left to interpret the rules, formulate the requirements and establish the technologies to meet legal obligations. This is the big flaw in storage compliance -- organizations jump to a "vendor's solution" that promises compliance, without fully understanding the laws and their impact. For example, a disk storage platform may offer reliability but might not provide the security or immutability demanded for your industry. "Buying an array for long-term retention is nice," says Jim Damoulakis, chief technology officer (CTO) at GlassHouse Technologies Inc. in Framingham, Mass. "But it doesn't make you compliant." Policies and processes must be implemented to manage storage compliance.
Similarly, there are no mandates for software tools. Analysts note that storage compliance software typically includes a discovery tool to see the data on hand and determine the candidates for archiving. Migration tools handle the actual data movement, and reporting tools track file access and user activity. The actual suite and choice of tools depend on your budget, preferences, business environment and, increasingly, the preferences of business partners. "If you've got a preferred vendor or reseller that you like to work with, you're also going to be somewhat subject to their [compliance] preferences," says Greg Schulz, founder and senior analyst at the Storage I/O Group in Stillwater, Minn.
Managing data storage compliance
Compliance will impose additional management overhead on an IT organization. Analysts note that tasks like protecting information, enforcing security and ensuring recoverability are not necessarily additional load -- well-run organizations should already be doing this. Workload is added in the monitoring, reporting and auditing capabilities needed to support storage compliance. "The thing that adds workload is that you now have to prove that you're doing it," Damoulakis says. In some cases, the cost can be substantial. Back in 2004, General Electric Co. revealed it spent about $30 million in compliance costs just to meet SOX regulations.
However, once a compliance system is chosen and installed, the additional load on IT labor can be relatively small. Damoulakis suggests that large companies may be able to justify another person to handle the compliance platform, manage backups and assist with discovery. In smaller organizations, the overhead is even less.
The impact of data storage compliance
Storage compliance has brought efficiency and automation to the forefront. IT departments can no longer afford to spend enormous amounts of human capital searching tapes and attempting to locate relevant evidence prior to litigation. Organizations are easing this burden by building intelligence around their data and understanding the business/legal importance of data rather than treating data as simply a volume of files. "Suddenly we have [in compliance] a Global Positioning System (GPS) when before, we had a compass that really didn't point anywhere," says Brian Babineau, analyst at the Enterprise Strategy Group in Milford, Mass.
There is also a lot more storage to deal with as organizations keep more data for a longer period of time. This realization is driving much more effective management practices. For example, if inactive data is not archived in a timely manner, it is backed up and replicated (and possibly restored) along with active data -- demanding more backup storage, along with corresponding time and costs. Tiered storage and information lifecycle management (ILM) are two technologies intended to enhance storage efficiency.
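The cost of leaving inactive data in the backup stream can be made concrete with a small calculation. The sketch below is illustrative only: the `FileInfo` record, the 180-day inactivity threshold and the number of retained backup copies are assumptions, not figures from any regulation or product.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class FileInfo(NamedTuple):
    path: str
    size_bytes: int
    last_access: datetime


def split_by_activity(files, now, inactive_after_days=180):
    """Separate files into (active, inactive) by last access time."""
    cutoff = now - timedelta(days=inactive_after_days)
    active = [f for f in files if f.last_access >= cutoff]
    inactive = [f for f in files if f.last_access < cutoff]
    return active, inactive


def backup_bytes(files, copies_retained):
    """Backup capacity consumed if every file appears in every retained copy."""
    return sum(f.size_bytes for f in files) * copies_retained


now = datetime(2006, 6, 1)
files = [
    FileInfo("/mail/current.pst", 100, datetime(2006, 5, 1)),
    FileInfo("/mail/old.pst", 900, datetime(2005, 1, 1)),
]
active, inactive = split_by_activity(files, now)
before = backup_bytes(files, copies_retained=4)   # everything backed up
after = backup_bytes(active, copies_retained=4)   # inactive data archived first
```

In this made-up example, archiving the one inactive file removes most of the volume from every backup copy, which is the effect the paragraph above describes.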
Compliance in managed services
Storage compliance issues are a primary concern for Postini Inc., an independent provider of managed email services. With over 100 terabytes (TB) of user storage currently under management, Postini must meet the same compliance standards as its clients. While regulations, like SOX and HIPAA, influence the flow and management of information, it's really Securities and Exchange Commission (SEC) Rule 17a-4 that has the biggest impact. "It relates to the immutability of the data once it's been written and governs who is accessing and acting upon that data, as well as sampling and reviewing the data that's been stored," says Scott Petry, founder, CTO and executive vice president of product development at Postini.
The goal for Postini was to implement scalable storage for compliance without committing to a vendor-specific storage platform. Today, that goal is met with Archivas Inc. software supplying the compliance management layer on top of Dell Inc. storage subsystems -- though Petry underscores the idea that virtually any storage platform would work for Postini's hardware-agnostic architecture. "We get good cost efficiency without sacrificing the regulatory or compliance components," Petry says.
A principal goal of storage compliance is to ensure the long-term integrity and availability of data. Petry says that regular sampling and recovery testing should be an integral part of the compliance process -- a step often overlooked until a demand for records appears. "Companies have been fined by the SEC for not being able to produce records," Petry says. "I think regular drills and being able to pull the data and validate its integrity is critical -- and needs to be an ongoing practice along with other disaster recovery practices."
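A sampling drill of the kind Petry describes might look something like the following. This is a hedged sketch rather than Postini's process: it assumes a manifest of SHA-256 checksums was recorded when each record was written, and the function name `verify_sample` and the dictionary-based archive are hypothetical stand-ins for a real retrieval interface.

```python
import hashlib
import random


def verify_sample(archive, manifest, sample_size, seed=None):
    """Pull a random sample of records and check each against its checksum.

    archive:  mapping of record ID -> bytes as retrieved from storage
    manifest: mapping of record ID -> SHA-256 hex digest recorded at write time
    Returns (sampled_ids, failed_ids); a record fails if it is missing
    or if its current digest differs from the recorded one.
    """
    rng = random.Random(seed)
    ids = sorted(manifest)
    sampled = rng.sample(ids, min(sample_size, len(ids)))
    failed = []
    for record_id in sampled:
        data = archive.get(record_id)
        if data is None or hashlib.sha256(data).hexdigest() != manifest[record_id]:
            failed.append(record_id)
    return sampled, failed
```

Running such a check on a schedule, and keeping the results, produces evidence that records could be retrieved and validated before anyone demands them.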
Managing for lifecycles
While the focus of storage compliance is often on saving data and maintaining its integrity, data deletion is an inevitable consideration -- prompting many organizations to perceive data in terms of its lifecycle. In Seattle's King County court system, the limited jurisdiction of civil cases mandates the destruction of court documents after a relatively short time -- usually 10 years, though some documents may need to be deleted in as little as 30 days. Because the court system is a municipal entity not bound by major regulations like SOX, its document retention and deletion are mandated through the court.
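A retention schedule like this reduces to a simple date calculation per document type. The sketch below uses the two periods mentioned above (10 years and 30 days), but the document type names and the `RETENTION` table are invented for illustration and do not reflect the court's actual schedule.

```python
from datetime import date, timedelta

# Hypothetical document types mapped to (unit, amount) retention periods.
RETENTION = {
    "civil_case_file": ("years", 10),
    "short_lived_notice": ("days", 30),
}


def destruction_date(filed: date, doc_type: str) -> date:
    """Earliest date on which a document of this type may be destroyed."""
    unit, amount = RETENTION[doc_type]
    if unit == "days":
        return filed + timedelta(days=amount)
    try:
        return filed.replace(year=filed.year + amount)
    except ValueError:
        # Filed on Feb. 29 and the target year is not a leap year.
        return filed.replace(year=filed.year + amount, day=28)


def is_due_for_destruction(filed: date, doc_type: str, today: date) -> bool:
    return today >= destruction_date(filed, doc_type)
```

A lifecycle job would evaluate `is_due_for_destruction` for each stored document and queue the expired ones for deletion, subject to whatever review the governing policy requires.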
The King County court system has been meeting its storage compliance needs with an EMC Corp. Centera for over a year. "We'll probably end up with 3-4 TB worth of documents that we store over the next several years," says David Jones, applications supervisor for the King County court system. "And then it will probably level out about there [4 TB] because of the document retention cycles." Another 1 TB of .wav courtroom recordings is saved on a conventional server. Jones liked the notion of building custom software for court records and then accessing the Centera through its application program interface (API). Other features were also appealing. "The CAS solution [in the Centera] is very appropriate for our type of content."
For Jones, the biggest hurdle in meeting storage compliance is not the hardware or the software, but the human element: moving from legacy "paper" systems in a semiautonomous series of eight courtrooms to a more centralized electronic environment. "Standardizing policies and procedures and how things have been tracked is more the issue," he says. "Making sure that everyone has actually closed out cases the same way in all of the [eight King County] courts."