Storage Magazine

Data grids for storage
by Ray Lucchesi
Issue: Oct 2005
Current grid computing projects, run mostly by scientific teams, offer some tantalizing prospects for general corporate computing. Imagine making your organization's data accessible throughout the world or replicating data to multiple, geographically dispersed sites--even sites you don't own or control, but with which you collaborate.

If you use traditional access-control methods, the barriers to this scenario are substantial. You could, for example, set up replicated Web or FTP mirror sites, with user logins and passwords for every site providing access, or set up VPN access to each site holding the data.

Open-source grids
Globus is the progenitor of many of today's grids. The Globus Alliance and the Global Grid Forum (GGF) support the Globus Toolkit, and have developed some of the fundamental services required to implement a grid. The GGF is also charged with popularizing the grid by making it easier for all users to participate in grid work.

There are essentially three different modes of Globus support software: the API-based model in Globus Toolkit version 2.0 (GT2), the service model in GT3 and the Web services resource framework in Globus Toolkit version 4.0 (GT4), released last May.

There are many compute-data grids in operation around the world, including AstroGrid, the Biomedical Informatics Research Network, Enabling Grids for E-sciencE, the Grid Physics Network and the Particle Physics Data Grid.

But it isn't easy to replicate data to alternate sites with an FTP site, and user IDs/passwords become a major hassle with multiple sites. VPNs require different passwords and configurations for each data repository site, and users would certainly balk at having to navigate 10 or 100 VPN connections to get one piece of data. Another--and better--solution is to use a data grid.

Data grids
With so many storage vendors touting some sort of grid architecture these days, an accurate definition of a grid may be elusive. For the purposes of this article, a grid spans sites, companies and continents with non-proprietary hardware, software and protocols supporting authenticated access, replication and compute services. Clustered file systems don't qualify as data grids because they typically exist at one or two sites and require high-bandwidth connections between nodes. Wide-area file systems come closer to a data grid model, but they don't currently offer continent-spanning or multicompany-hosted data; they also require proprietary hardware, software and internode protocols.

It's possible to use a grid to securely share your data and compute services. To tap into these capabilities, you need to implement standard, compliant grid services on your systems. These services are available from the open-source community; proprietary grid products are also available from some vendors, including IBM Corp., Oracle Corp., Silicon Graphics Inc. (SGI)/YottaYotta Inc. and Sun Microsystems Inc.

Data grids are perfect for organizations that need a collaborative work environment despite having diverse, distributed resources where data resides across multiple business and/or organizational domains. Data grid services allow users to access and manipulate data residing at sites around the world. Data can be retrieved from any location on the grid, and can be deposited or replicated to any location with space.
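
The retrieve-and-replicate behavior described above can be pictured with a small Python sketch of a replica catalog, a lookup table that maps a logical file name to the sites holding a copy. This is only an illustration under stated assumptions: the class, method, site and file names are hypothetical and are not part of the Globus Toolkit or any vendor product, and a real data grid would add authentication and actual data transfer.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy in-memory catalog: logical file name -> sites holding a copy.

    Hypothetical illustration only; not a Globus Toolkit API.
    """

    def __init__(self, free_space_gb):
        # free_space_gb: assumed map of site name -> free capacity in GB
        self.free_space_gb = dict(free_space_gb)
        self.replicas = defaultdict(set)

    def deposit(self, logical_name, site, size_gb):
        """Record a copy at `site` if it has room; return True on success."""
        if site in self.replicas[logical_name]:
            return True  # a replica already exists at this site
        if self.free_space_gb.get(site, 0) < size_gb:
            return False  # unknown site or not enough space
        self.free_space_gb[site] -= size_gb
        self.replicas[logical_name].add(site)
        return True

    def locate(self, logical_name):
        """Return the sites that hold a copy, in alphabetical order."""
        return sorted(self.replicas.get(logical_name, ()))


catalog = ReplicaCatalog({"site-a": 100, "site-b": 10})
catalog.deposit("survey.dat", "site-a", 40)   # fits
catalog.deposit("survey.dat", "site-b", 40)   # rejected: only 10 GB free
print(catalog.locate("survey.dat"))
```

In this sketch a user asks for a file by its logical name and the catalog answers with every location that can serve it, which is the property that lets data be retrieved from any location on the grid.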

Compute grids
A compute grid can schedule computation to occur at one site with the results transmitted to another (see "Open-source grids," above), and a compute grid may exist with or without a data grid. Together, a compute grid and data grid can interoperate to move data residing throughout the grid to where computation can occur and send results wherever required.

For example, animators can publish images on a grid and provide access to other artists to supply the background, foreground and other elements. Further processing can be done on any grid-enabled system with available cycles. Results can be transmitted back to the original location or sent elsewhere for further processing. Computations can be handed from one system to another to take advantage of each node's capabilities.
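
As a rough illustration of how a compute grid and a data grid might cooperate, the following Python sketch picks a site for a job by preferring a system that already holds the data and has enough idle processors. The function name, parameters and site names are assumptions made for this example; real grid schedulers weigh many more factors, such as network bandwidth, policy and credentials.

```python
def choose_site(dataset_gb, replica_sites, idle_cpus, cpus_needed):
    """Pick a site for a job, preferring one that already holds the data.

    dataset_gb    -- size of the input data in GB
    replica_sites -- set of sites that already hold a copy of the data
    idle_cpus     -- map of site name -> number of idle CPUs
    cpus_needed   -- CPUs the job requires

    Returns (site, gb_to_transfer), or None if no site can run the job.
    All names here are hypothetical; this is not a real scheduler API.
    """
    candidates = [s for s, cpus in idle_cpus.items() if cpus >= cpus_needed]
    if not candidates:
        return None

    def cost(site):
        # Sites with a local replica need no transfer. Ties are broken
        # by most idle CPUs, then by name so the choice is deterministic.
        transfer = 0 if site in replica_sites else dataset_gb
        return (transfer, -idle_cpus[site], site)

    best = min(candidates, key=cost)
    return best, (0 if best in replica_sites else dataset_gb)


idle = {"site-a": 4, "site-b": 16, "site-c": 32}
print(choose_site(50, {"site-b"}, idle, 8))  # data already at site-b
print(choose_site(50, {"site-a"}, idle, 8))  # site-a is too small, so data must move
```

When no capable site holds the data, the sketch falls back to the site with the most idle processors and reports how much data would have to be moved there.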
