Like most professional sports teams, the Boston Celtics have long dealt with a flood of data. But all the statistics on points, assists and rebounds, and even business information, don't consume more than 1 terabyte of the team's nearly 100 terabytes of data storage.
Digital video accounts for the largest share of the Celtics' data, with audio, photographs and graphics taking up less space. With an expected growth rate of at least 20 terabytes (TB) per year, the team is weighing a shift from scale-up to scale-out storage, according to Jay Wessland, vice president and CTO for the Celtics.
The Celtics currently use EMC's VNX scale-up unified system for primary storage and an older Clariion AX4 for disk-based backups and archives. The team is considering EMC's Isilon NAS among its scale-out options.
"We're going to take a step back, really look hard, evaluate and talk about a long-term strategy," Wessland said. "Rather than spending a little bit of money now to try and do some minor upgrades, we're really going to try and hold off. In the next off-season, we'll do a major upgrade and make a lot of changes."
Wessland took time out at last week's CIO Summit, hosted by Extreme Networks at Gillette Stadium in Foxboro, Mass., to discuss the Celtics' strategy and plans for video and other data storage.
What have been your greatest challenges with data storage over the last few years?
Jay Wessland: Challenges have been just the explosion of digital video. It dwarfs everything else we do. We have a lot of other rich media as well -- audio, graphics and still photography. But the digital video is what kills everything else. It grows [by] terabytes almost weekly it seems. So, it literally dwarfs everything else that matters in the data center.
The good news is that our marketing partner and sponsor is EMC, so I have help. It doesn't mean I get free storage from EMC, but it means I get good advice. We had a big summit at EMC with a lot of their different departments a week ago and talked about where we were and what makes sense long term to see if we can get ahead of it rather than just continuing to throw more disks at the current boxes.
How does the team use video?
Wessland: There are two parts of our business. On the admin/business side, it's game video, interview video and event video that is used on Celtics.com, YouTube and social media. We sort of run a TV station, but we don't broadcast it over the air. We broadcast on the Internet.
There's also the basketball side, which has a lot of similar video, but it's very targeted at scouting -- other teams and other players as well as self-scouting. We call it opponent scout and self-scout. [The video is] in a different location and stored separately.
What do you currently use for storage?
Wessland: Primarily VNX. There's a bunch of storage that's iSCSI-attached to VMware and is block based, and most of the digital video [storage] is file based, just SMB shares, so they can be used by PCs and Macs.
Keeping video on scale-up, file-based storage sounds expensive to expand.
Wessland: Yes, it's expensive. But the other concern is [that] it's in one place, and to protect it, you have to buy it twice and put it in two places and connect the two places. So, we're looking at a more scale-out model. EMC likes to call it a data lake model. We're potentially looking at EMC Isilon. The Red Sox have leveraged Isilon pretty heavily and have had very good luck with it, so we may follow that.
Is storage tiering in your plans?
Wessland: Definitely. Tiered storage is what we've been looking at to try and put the data that's at rest, which is an awful lot, as far out in the tiers and in the cheapest storage as we can and then keep the current stuff on faster storage. All of that you can do within the Isilon framework.
Right now, we overkill everything. It's all in the VNX. It's got plenty of performance, but a lot of it doesn't need to be as close as it is. It just needs to be somewhere. And it's too big to go on tape. That's the thing. We don't have a good archive medium anymore.
Everybody says, "Oh well, you just back it up to the cloud." Well, back up to the cloud sounds good when you're a home user and you've got a couple gigabytes of pictures your grandma took. But in reality, when you're talking about hundreds of terabytes, backing up to the cloud probably isn't as reasonable as it sounds because of bandwidth. You can buy storage cheaply in the cloud. It's getting it there and keeping it updated.
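To put rough numbers on the bandwidth point, the short Python sketch below estimates how long an initial upload would take. The 100 TB figure comes from the team's stated storage footprint; the link speeds and the 80% efficiency factor are illustrative assumptions, not figures from the Celtics, and the helper name `transfer_days` is hypothetical.

```python
def transfer_days(terabytes: float, megabits_per_second: float,
                  efficiency: float = 0.8) -> float:
    """Estimate the days needed to move `terabytes` over a network link.

    `efficiency` is an assumed fraction of the nominal link speed that is
    actually usable for the transfer (protocol overhead, shared traffic).
    """
    total_bits = terabytes * 1e12 * 8                    # decimal TB -> bits
    usable_bits_per_second = megabits_per_second * 1e6 * efficiency
    seconds = total_bits / usable_bits_per_second
    return seconds / 86_400                              # seconds per day


if __name__ == "__main__":
    archive_tb = 100  # approximate current footprint cited in the article
    # Link speeds below are assumptions chosen only for illustration.
    for mbps in (100, 1_000, 10_000):
        days = transfer_days(archive_tb, mbps)
        print(f"{archive_tb} TB over {mbps} Mbps: about {days:.1f} days")
```

Under these assumptions, a dedicated 1 Gbps link would need about 11.6 days to seed 100 TB, and a 100 Mbps link about 116 days, before accounting for the new video that keeps arriving in the meantime.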
How do you back up and archive now?
Wessland: I don't have [a] good archive right now. Anything that's really important, we just try and store it twice, which is not a great plan. But that needs to get fixed with a real archival system.
In your case, you probably need to retain the archives forever.
Wessland: Yeah, absolutely. We don't want to get caught where we've been, where we've got a bunch of VHS tapes, and you can't really do anything with them anymore. We need to keep it somewhere where we can access it. But the reality is that it's rarely accessed. How many times do you need to go back to that 1968 footage of Celtics vs. 76ers? But you don't want to throw it away. You want it somewhere.
Some of that could potentially go to the cloud. One of the things that people talk negatively about storing in the cloud is [that] retrieval costs can sometimes be expensive. But given that we rarely retrieve it, and that we really just want it there in case we need it that one time, the retrieval cost isn't that big a deal. So, the cloud will come into play, I think, for some of this long term, and especially as bandwidth costs come down, which they do all the time, with our other partner, Comcast.
Do you have any concerns about using cloud storage?
Wessland: Oh sure. There are data privacy concerns and data protection concerns. I'm not as concerned about that as the advantages of getting out of the infrastructure business and letting somebody who's in the infrastructure business run the infrastructure. I think the advantage outweighs the concerns.
Is security a concern with the cloud?
Wessland: We're definitely very concerned about it, especially on the business side. We do store personal data. So, we protect that very carefully, and we will continue to, but it's not the concern that the medical industry has, for instance.
Has the intensified need for data analytics caused you to make changes to your storage?
Wessland: The data on the analytics side, although significant, is not huge. It's not really big data, Hadoop kind of stuff yet. It grows every year and is starting to mushroom as we add more and more things into that data set, different data feeds, different types of data. But it hasn't reached that sort of terabyte growth the digital video has yet. It's still in manageable gigabyte growth.
As we're doing all this other storage work for the digital video, we're keeping an eye on the analytics side, too. We need to someday get out of our old-school, Microsoft SQL-based storage of our statistical data. It might be Hadoop. It could be Oracle or SAP. Or who knows what it's going to be?
What are you doing with data analytics?
Wessland: We do a lot of analysis of in-game statistics for our own games, for opponents' games, for overseas games, college games to try and analyze team-based play and individual player-based plays. We've got two math guys and a software developer that do nothing but basketball analytics. And the other place is fan and ticketing data to try and optimize ticket sales and keep the fans as happy as possible.
Do you call that data from the VNX?
Wessland: Yes. That is also backed up in the cloud. We replicate that to a database server in the cloud. So, if we happen to be doing it with a third party or some system that's out in the cloud, we'll generally call it from our Amazon VPC [Virtual Private Cloud]. If it's an internal user trying to pull up a report on his computer, that generally gets called out of our local server.
Do you use flash storage?
Wessland: Performance isn't usually what's driving me, because we're such a small organization. But I see flash long term as a big win for storage because of power and cooling and eventually space [advantages]. I think we can store a lot more data a lot more efficiently, especially the at-rest data. We're spinning all these disks to keep this data that's at rest, and it makes much more sense to store it on flash if it were cost effective.
I think that flash will play into it, but it's still a few years out when it starts to take over the archive role. People don't look at it for [archiving] so much. Right now, the flash vendors have focused on performance. For someone like me, I'd rather someone take an archival look at flash because we've run out of space on the old archival platforms, tape and optical.