Hot Spots: Web 2.0 storage: Challenges and choices

Web 2.0 tools and strategies hold many potential benefits for businesses that deploy them, but their requirements for rapidly scalable storage and access, as well as persistent data, pose significant challenges for the IT staffs that need to build and manage the infrastructure.

This article can also be found in the Premium Editorial Download: Storage magazine: RAID turns 20: Do you still need it?

Storage managers need to anticipate the demands of Web 2.0 applications as they take their place in enterprise environments.


If you need evidence that Web 2.0 has gained widespread acceptance, look no further than eBay, Facebook and YouTube. These successful business models frame how we typically think of Web 2.0. They're highly collaborative and interactive, and they strive to reach a broad audience with mostly user-generated content.

These days, Web 2.0 isn't limited to twenty-somethings building an application in the basement; traditional brick-and-mortar organizations should also consider this new way of doing business. At the enterprise level, internal applications like instant messaging, Microsoft SharePoint and wikis all enable improved communication and information sharing. In many cases this extends to trusted partners and suppliers. Now consider the amount of storage this content (RSS feeds, wikis, blogs and more) creates, as nearly all Web 2.0 models require storage on some level.

Web 2.0 tools and strategies hold many potential benefits for those businesses that deploy them, but they also pose significant challenges for the information technology staffs that need to build and manage the infrastructure. IT managers are struggling with the cost and complexity of managing multiple interfaces to meet the demands of their business. Much of the current infrastructure wasn't designed to handle Web 2.0 application requirements at a price point that enables a company to deliver a profitable service. Web 2.0 applications will only further stress the system as they require the following:

Rapidly scalable storage and access. User-generated content grows at an unchecked rate--just consider the success of YouTube. Think how that infrastructure will be further impacted when high-definition video becomes prevalent. How about growing from zero storage to petabytes (PBs) of storage in just a few months? At the same time, hundreds of thousands (and potentially millions) of users will attempt to access that information. The combination makes for a complex IT equation.

Persistent data (no more post it and forget it). Users generate new content every day. Whatever form the content takes (a product review, a blog, music or video files), it remains unchanged and is retained. Most (if not all) uploaded data remains online until a user cancels their account, which they often don't bother to do. This requires massive amounts of available storage.

Cost. Given the potential raw amount of required storage--which can reach multiple PBs of capacity--it's very difficult to scale a high-end storage system in a cost-effective way. Think about the best-known Web 2.0 companies and you'll see that most of their best offerings are free or very inexpensive. That means accompanying infrastructures need to have low operational costs so that the Web 2.0 business model has a shot at success.
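The interaction between growth, persistence and cost described above can be made concrete with a little arithmetic. The following is a minimal planning sketch, not a sizing tool: the function names, the sign-up rate, the upload volume per user and the cost per terabyte are all hypothetical values chosen for illustration, and it assumes (as the persistent-data point suggests) that nothing is ever deleted.

```python
def projected_storage_tb(months, new_users_per_month, upload_gb_per_user_per_month):
    """Return cumulative stored data (decimal TB) at the end of each month.

    Assumes a constant number of new users each month, a constant upload
    volume per user, and no deletions (content is retained indefinitely).
    """
    users = 0
    stored_gb = 0.0
    totals_tb = []
    for _ in range(months):
        users += new_users_per_month                      # user base keeps growing
        stored_gb += users * upload_gb_per_user_per_month  # every user adds data
        totals_tb.append(stored_gb / 1000)                # 1 TB = 1000 GB here
    return totals_tb


def monthly_cost_per_user(stored_tb, users, cost_per_tb_month):
    """Storage cost per user per month for a given total capacity."""
    return stored_tb * cost_per_tb_month / users


if __name__ == "__main__":
    # Hypothetical inputs: 100,000 sign-ups a month, 0.5 GB uploaded per user per month.
    growth = projected_storage_tb(12, 100_000, 0.5)
    for month, tb in enumerate(growth, start=1):
        print(f"month {month:2d}: {tb:,.0f} TB stored")

    # Hypothetical fully loaded cost of 100 currency units per TB per month.
    users_after_year = 12 * 100_000
    print(monthly_cost_per_user(growth[-1], users_after_year, 100.0))
```

Because each month's uploads come from a larger user base and nothing is removed, capacity grows faster than the user count. For a free or low-priced service, the per-user figure from the second function is the number that has to stay below whatever revenue each user generates.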

Reliable and predictable performance. We can't talk about cost without talking about performance. While it's true that Web 2.0 services aren't typically tied to Wall Street trading apps, the customer experience is still an important factor. Users want to view stored files, images and videos in a timely manner or they'll switch to another service. Users typically become frustrated after waiting just a few seconds and give up if they have to wait too long. Service outages aren't acceptable and that type of news travels fast in the Web 2.0 world. The infrastructure supporting these applications must stay up 24/7, 365 days a year.

These days, it's clear that Web 2.0 features can benefit companies in vertical markets ranging from manufacturing to health care, retail to executive recruitment. Whether or not the Web 2.0 projects in those companies are handed to employees inside or outside the IT organization, storage professionals and their IT environments will be impacted.

When considering the storage infrastructure required for successful Web 2.0 initiatives, what's the best way to proceed? One way to answer that question is to start with a build or buy analysis.

Do-it-yourself method (build or buy)
Google decided to build its own storage infrastructure. Other companies, such as Facebook and MySpace, opted to buy products that supported their needs. Building out the infrastructure might make sense for Google because it has specialized expertise in building massively scalable parallel storage systems and a culture that supports this type of approach. However, this isn't the preferred approach for most companies. Leveraging commodity disks and building out a proprietary file system takes years and lots of resources. If you believe your company has the talent, time, support and internal buy-in to take on this endeavor, you may want to consider this option. But ask yourself this question: Can you get your company's new service or product to market faster by building the infrastructure yourself or by using an off-the-shelf solution?

Again, the requirements for Web 2.0 models are very different from those associated with email or database applications. With Web 2.0, you're aiming to build a user population of hundreds of thousands (and potentially millions) of users. The data will be stored forever and will probably be filled with images, audio files and movies. If you choose to buy the infrastructure, where do you begin? Do you leverage your existing storage infrastructure that was really built for a different set of requirements or do you seek out next-generation storage systems built for these types of apps? These systems need to be massively scalable, provide predictable and reliable performance, and come at an affordable price point. With the potential to store billions of files and objects, they must also be easy to manage across multiple systems or geographies. In addition, they must interface with Web-based protocols and have replication capability.

Outsourcing
One of the more interesting developments in outsourcing is related to Web 2.0. Outsourcing companies are providing not only the infrastructure and hosting facilities, but also the services to help accelerate Web 2.0 initiatives. Providers such as Amazon, with its Simple Storage Service (S3), and Nirvanix offer APIs to help customers handle multi-tenant applications, file sharing, and video or image transcoding. While these services are relatively new (S3 has been around for approximately one year and Nirvanix just launched this fall), they're generating significant interest; they're basically Web 2.0 vendors that are enabling their customers to offer Web 2.0 services by using their infrastructure and processes.

Speed and agility are essential in a Web 2.0 economy. It's therefore vital for IT leaders to evaluate their specific business requirements and determine the infrastructure needed to sustain a successful Web 2.0 business. This will require deviating from traditional solutions and approaches. A shift in mindset, perhaps even in culture, may also be needed. You will need to consider the following:
  • If you decide to build, do you have the technical acumen and vision to build the right thing?
  • If you buy, do you go with what you know or do you take a step back and analyze the needs of a Web 2.0 world vs. your current landscape?
  • If you rent, who are the right partners? Do they have what you need and can you work with them effectively?
  • Can you rise above being a technologist and realize that core services and business processes are vital to implementing Web 2.0?

Traditional brick-and-mortar businesses will need to figure out how to capitalize on Web 2.0 because if they don't, someone else in their market will.

This was first published in November 2007
