The problem with unstructured information

Information security expert Kevin Beaver highlights the problem of managing unstructured information and what you can do to keep your organization on the right track to storage security and out of hot water.

There's a problem in the storage world that's getting out of hand: unstructured files -- a large percentage of which contain what could be considered sensitive information -- strewn across the network on every storage device imaginable. On practically every network, thousands of files containing sensitive text sit unaccounted for, unclassified and unprotected. Sensitive information spread around the enterprise is not necessarily a problem in and of itself, but once you've thrown in all the information protection requirements mandated by governments and industry groups at all levels, you've got yourself quite an issue to manage.

Practically every business in the industrialized world is directly or indirectly affected by information privacy and security laws and regulations. Considering what's at stake, you have to not only identify where your information risks are (i.e., where sensitive information is located on the network), but also classify that information, figure out who needs access to what, implement proper access controls and then make sure only those authorized to access it are indeed the ones doing so. Whether you're a storage administrator, network administrator, compliance officer -- or all three -- this affects you in many ways.

Virtually every computer system has sensitive information stored on it. That's what they're made for, right? Whether this unstructured information is in the form of Word documents, Excel spreadsheets, text files, PDF files or flat-file databases, based on what I'm seeing, I'd venture to bet you've got sensitive information in every nook and cranny of your network. Information and locations you've never thought about or never knew existed.

Until recently, the unstructured information dilemma has been off the radar of most security vendors and IT professionals. Perimeter security controls consisting of firewalls and intrusion detection systems have been all the rage. The premise (and misconception) has been that if you keep the bad guys off of the network -- especially the storage area network (SAN) -- nothing bad will happen. However, this simply isn't reality. There is just as much, if not more, malicious use and abuse by trusted insiders. Fueling this problem is the complexity of our applications and information systems, as well as users who are unaware of how their actions can cause problems, such as:

  • Applications that leave temporary files in common and not-so-common locations on local workstation drives, such as temp directories, the root of the C: drive and program installation folders. It's anyone's guess where information may or may not be stored.
  • Laptop computers with entire databases, synchronized copies of server shares and other local work spread all around their local hard drives. And don't think user passwords and file access controls will keep this information protected. Laptops can be hacked all too easily, as outlined in this step-by-step guide to preventing laptop hacking.
  • Users copying files from the server to their local Windows desktop "real quick," so they can work while they travel.
  • Users habitually saving sensitive information locally that should be stored on a network so it can be backed up and controlled by the storage or network administrator.
  • Mobile devices, such as PDAs and smartphones, housing sensitive files with the only protective measure being the assumption that their owners will keep them guarded and physically protected at all times.
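
To get a rough feel for how much of this material sits on a given workstation or laptop, something like the following Python sketch could be used. It is only an illustration, not a tool referenced in this tip: the extension list and the choice of starting folders (the temp directory and the user's home folder) are assumptions, and the function names are hypothetical.

```python
import os
import tempfile
from collections import Counter

# Assumed set of "unstructured" document types; adjust to your environment.
DOCUMENT_EXTENSIONS = {".doc", ".xls", ".pdf", ".txt", ".csv", ".mdb"}


def default_roots():
    """Return two assumed starting points: the temp directory and the home folder."""
    return [tempfile.gettempdir(), os.path.expanduser("~")]


def inventory(roots):
    """Count document-type files under each root, keyed by (root, extension)."""
    counts = Counter()
    for root in roots:
        # os.walk silently skips directories the current user cannot read.
        for _dirpath, _dirnames, filenames in os.walk(root):
            for filename in filenames:
                extension = os.path.splitext(filename)[1].lower()
                if extension in DOCUMENT_EXTENSIONS:
                    counts[(root, extension)] += 1
    return counts
```

Calling `inventory(default_roots())` returns counts only; it says nothing about whether a file is actually sensitive, so treat the result as a starting point for a closer look.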

A quick and dirty test you can run to demonstrate the problem is to use a text search utility and look for sensitive text (e.g., date of birth, Social Security number, credit information) stored in text-based files on various server shares, local drives and elsewhere on your network. For example, I like using Effective File Search and FileLocator Pro to look for sensitive information while logged in as a standard user, as shown in the diagram below.

[Figure: Looking for sensitive information while logged in as a standard user in FileLocator Pro]

Obviously, running this search as an administrator, or root equivalent, is unlikely to turn up anything that account doesn't already have access to, but the results can still highlight access control misconfigurations, especially for information that only a select group of privileged users should be permitted to access. You can also use Google's desktop search capabilities, or even your favorite operating system search tool (e.g., find or Windows Explorer), to search network drives. Keep in mind that discovering sensitive data doesn't automatically equate to business risk -- it all depends on the context. The bottom line is that I'm always astounded by what I find stored -- unknown and unprotected -- all across different networks using this method.
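
If you don't have one of these utilities handy, the same quick and dirty test can be approximated with a short script. The Python sketch below is a minimal illustration under stated assumptions, not how any of the tools named above work: the regular expressions, the list of text file extensions and the function names are all hypothetical, and the patterns are deliberately simple, so expect false positives and misses.

```python
import os
import re

# Assumed example patterns; tune them to the data you actually care about.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "dob": re.compile(r"\b(?:date of birth|dob)\b", re.IGNORECASE),
    "card": re.compile(r"\b(?:\d{4}[- ]){3}\d{4}\b"),
}

# Assumed set of text-based file types worth opening.
TEXT_EXTENSIONS = {".txt", ".csv", ".log", ".xml", ".htm", ".html"}


def scan_file(path, max_chars=1_000_000):
    """Return the sorted names of patterns found in the first max_chars of path."""
    try:
        with open(path, "r", encoding="utf-8", errors="ignore") as handle:
            text = handle.read(max_chars)
    except OSError:
        # The file cannot be read by the account running the search.
        return []
    return sorted(name for name, regex in PATTERNS.items() if regex.search(text))


def scan_tree(root):
    """Walk root and map each matching text file path to the patterns it contains."""
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if os.path.splitext(filename)[1].lower() not in TEXT_EXTENSIONS:
                continue
            path = os.path.join(dirpath, filename)
            found = scan_file(path)
            if found:
                hits[path] = found
    return hits
```

Run it while logged in as a standard user and point `scan_tree` at a mapped drive or local folder; every path in the result is a file that account could read and that matched at least one pattern. A match is only a lead to review in context, not proof of a problem.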

It's one thing to have your SAN or network attached storage (NAS) environment "locked down," but it's quite another to have that same information spread across your network in an unmanageable, unclassifiable and unaccountable fashion. Storage security is coming of age and now's the time to start thinking about reining in your sensitive files both inside and outside of your traditional storage boundaries by identifying them, classifying them and applying whatever access controls are needed to keep them reasonably protected. Don't fall into the deadly trap of instituting a written policy and assuming that your users will automatically abide by it and store sensitive information in the proper places. Users will undoubtedly go down the path of highest convenience, pushing security aside if it gets in the way of them getting their jobs done.

If you've got anything more than a handful of servers and a few mobile devices, the only way to rein in your unstructured information and keep it protected is to use technology to your advantage. Set yourself, your users and your organization up for success and implement one of the relatively new information discovery/classification technologies being offered by a growing number of storage and third-party providers. That, and some ongoing subtle reminders to your users, will do wonders to keep this problem to a minimum.

About the author
Kevin Beaver is an independent information security consultant and expert witness with Atlanta-based Principle Logic, LLC. He has more than 18 years of experience in IT and specializes in performing information security assessments revolving around compliance and IT governance. Kevin has written six books, including Hacking For Dummies (Wiley), Hacking Wireless Networks For Dummies and The Practical Guide to HIPAA Privacy and Security Compliance (Auerbach). He can be reached at [email protected]
