"Actually, there are two kinds of data -- controlled and the Wild West," said Tory Skyers, senior systems engineer at real estate firm Prudential Fox and Roach, during a panel discussion at the Storage Decisions conference here. "There's unstructured data that you don't know what it is and structured data that we have control of. My job is knowing which is which."
Brian Greenberg, director of data protection services at a financial services firm in the Midwest that he is not allowed to publicly identify, also used the Wild West analogy. He said maintaining order over data isn't easy when the users creating it generate millions of dollars for the company. "I had people who just made a million dollars for the company last week tell me they want to keep data," Greenberg said. "It's hard to say no. So I charged back for data to be backed up, and I gave business units bills every month for backups. They didn't like that, so we made groups give a business case for backing up data."
Skyers made his case financially to senior management. "I showed them how much money they were wasting by not doing it my way," he said. "We look at what's business data and what's not. You would have to present a business case to me to store something on my SAN."
Marcellus Tabor, who manages the data protection team at Yahoo, took a diplomatic approach. "We're less like the Wild West and more like the United Nations," he said. "My users are very diverse. We have the big groups and then groups that are smaller in stature but big revenue producers." He said his staff and business leaders hold weekly meetings to discuss backup policy. "We decide how much of a percentage of what we need to back up is backed up," he said.
The panelists agreed that they don't always get the tools they need from storage vendors, at least not without cajoling. Tabor and Greenberg said their companies use home-grown applications to supplement vendor software, while Skyers said he primarily uses off-the-shelf software but writes scripts to fill some holes.
Tabor said Yahoo leverages the best features of its vendors' products and builds applications to do the rest. "It's public knowledge we use a lot of NetApp," he said. "For backup, NetApp is good at giving up-to-the-minute performance statistics on backup failures that our home-grown tool can't do. But with home-grown stuff, we can plug in other features we want."
Skyers uses backup software from CommVault Systems Inc. and said he particularly likes the way it exports data in XML. But, he said, "With off-the-shelf software, there's always something missing." Having "a very adversarial relationship" with vendors sometimes helps him get what he needs, he added.
Compliance adds to the chaos
Compliance with legal regulations now plays a big role in data protection and brings lawyers into decisions about what gets kept and what gets deleted. Skyers said his company had 300% data growth this year, putting a strain on his storage infrastructure. "Next year, our data growth [chart] will be a straight line up, and it's all due to regulations," he said. "We have to store every piece of data around a transaction. But the fact that I have to keep it around so long increases the chances that somebody's going to get it."
Greenberg said his financial services company has regulated data and nonregulated data, and treats them differently for compliance reasons. "Lawyers come in two varieties: Some say destroy everything now or keep everything forever," he said. His company keeps regulated data but has a quick trigger for much of the rest. "Our traders' system gets replicated in real time," he said. "HR, we replicate once a day." But nonregulated emails get deleted in a day.