In this SearchStorage podcast, Mesabi Group principal analyst David Hill discusses the impact of linear tape file systems on the media and health care industries, and why health care has yet to embrace the cost savings and efficiencies often associated with the technology.
Can you give me a brief overview of the history of Linear Tape File System (LTFS) technology?
David Hill: First of all, you know that magnetic tape has been around for a long time. But one of the problems with magnetic tape was that you stacked files sequentially and kept an index in a database somewhere -- for backup and restore, for example -- but the metadata wasn't tied in with the data itself.
So the tapes were not interoperable. You couldn't move them around. You couldn't bring [tapes] up without having the backup systems in the same place. IBM went to work and put together a technology prototype for what later became LTFS. Then they worked with the LTO Consortium, which also includes Hewlett-Packard, to put together a specification. That spec came out in April 2010.
The media and health care industries are popular places for implementing linear tape file systems. Why are these industries leading the way?
Hill: One reason is that they did a lot of work with analog devices, and digital is a much easier way of doing business. They're also starting to have much greater volume, especially with the digitization of movies. For example, Avatar went up to about a petabyte. Finally, if you're in the field taking pictures or video and using LTFS and a tape drive [to store the data], you can ship that tape somewhere once you're done. You don't have to tie up valuable network time.
The Picture Archiving and Communication System (PACS) used to store MRI data is frequently associated with LTFS. Why is this?
Hill: It isn't happening that often yet. LTFS is still new; it's only three years old. When you get into PACS, you have to start using more 'solutions' -- QSTAR has one for archiving, and Crossroads has Strongbox and is working with Fujifilm's Permavault on medical applications. You need something to put [these applications] together. For example, let's say a person has multiple images -- X-rays, CT scans and MRIs -- and they're all stored on tape. What you want to avoid is a search that runs from the front of the tape to the back, to the middle, and back and forth. So, with PACS, you need some extra software on top to do that search correctly.
With PACS, you're going to have to store data for a long, long time. You're also going to have to copy the data. The advantage of doing that with linear tape file systems is that you can transport the tapes to another location, and it's easier to bring up the data because it's interoperable. It's not tied to a backup or restore system.
Is it too early to tell whether there are real differences between LTFS for health care and LTFS for media use cases?
Hill: There are some [differences]. You don't have regulation in the media and entertainment industry; players in the health care industry have to be more cautious about how they move forward. The level of pain is higher in the media and entertainment business; when you have that, you adopt things more quickly. You have competition in the media and entertainment business. What is your competitor doing to get ahead of you? What kind of technologies are they adopting? The health care industry is not as fast-moving as the media and entertainment industry because it doesn't have this competition.
Then there is the cost focus. The media and entertainment industry is very focused on that. There is a focus on cost in health care, but health care isn't under as much immediate pressure as media and entertainment companies. So those are your four [differences]: regulation, level of pain, competition and cost focus.
Can linear tape file systems play a role in big data or is it too difficult to integrate into big data processing architectures?
Hill: It depends on what kind of big data you're talking about. Some big data is used in real-time analysis, like fraud detection. [LTFS is] not going to be used in that. [But some] big data is kept around for long periods of time. For example, if you're doing genome research and taking DNA samples with a lot of data to be saved, and if you don't need [to access] it for a long period of time, then tape is the best place [to store it]. You might say tape reigns over the long tail of the distribution. The long tail is the data you don't need to retrieve quickly, where you can stand latencies of 30 to 40 seconds or even two-and-a-half minutes. Big data applications can mean using a lot of data, but not every day. This is when it makes sense to put it on tape. So yes, active archiving and tape will play a big role in big data.
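[Editor's note: The rule of thumb Hill describes, that tape suits data which is read rarely and can tolerate a long wait, can be sketched roughly in Python. This is only an illustration under stated assumptions: the `Dataset` fields, the `suggest_tier` function and the once-a-month access threshold are hypothetical and are not part of LTFS or any product mentioned here. The 150-second figure simply mirrors the "two-and-a-half minutes" cited above.]

```python
from dataclasses import dataclass

# Assumption: worst-case tape retrieval latency in seconds, taken from the
# "two-and-a-half minutes" figure quoted in the interview.
TAPE_WORST_CASE_LATENCY_S = 150


@dataclass
class Dataset:
    name: str
    max_acceptable_latency_s: float  # how long a reader can wait for the data
    accesses_per_month: float        # rough estimate of how often it is read


def suggest_tier(ds: Dataset, max_tape_accesses_per_month: float = 1.0) -> str:
    """Suggest "tape" only if the data tolerates tape latency and is rarely read.

    The access threshold is a hypothetical cutoff chosen for illustration.
    """
    tolerates_latency = ds.max_acceptable_latency_s >= TAPE_WORST_CASE_LATENCY_S
    rarely_read = ds.accesses_per_month <= max_tape_accesses_per_month
    return "tape" if tolerates_latency and rarely_read else "disk"


# Real-time fraud detection cannot wait minutes; a genome archive can.
fraud = Dataset("fraud-detection-feed", max_acceptable_latency_s=0.05,
                accesses_per_month=100_000)
genome = Dataset("genome-archive", max_acceptable_latency_s=600,
                 accesses_per_month=0.1)
print(suggest_tier(fraud), suggest_tier(genome))
```

[In practice, the thresholds would depend on the tape library, the drive count and the application; the point is only that latency tolerance and access frequency together decide where the data belongs.]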