This article can also be found in the Premium Editorial Download "Storage magazine: Better disaster recovery testing techniques."
Security is a major concern for any grid. Only authorized users can access a grid, and data grid transmissions can be encrypted. Authentication relies on strong encryption based on public key infrastructure (PKI). One advantage of grid services is that security is built in from the start: once a user is authenticated, user IDs and passwords aren't required at every site entered, yet secure authentication is maintained. Grid administrators can also set up access constraints. For example, on the Earth System Grid (ESG), administrators limit the amount of data and/or the number of files users can download to govern overall grid activity.
Grid software licensing issues
Most standard computer-resource accounting models break down when applications run across a distributed grid, so another issue facing grid use is software licensing. It's hard enough ensuring proper software licensing for all of the computers in one data center, but try doing this for 14,000 CPUs across 135 data centers. Grid use has evolved based mainly on open-source applications with generous licensing terms.
A Directed Acyclic Graph (DAG) structure is used to schedule Condor-G grid work. A DAG can coordinate the steps in a computation and/or the data grid accesses needed to supply the processing or data being requested. Because processing the data may take a number of steps, the work can be broken down into manageable execution steps so that the process can be easily restarted if a failure occurs. Automated resource managers on the grid take the DAG and parcel out the work to the grid nodes supplying the requested services.
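To make the restart idea concrete, here is a minimal sketch of running steps in dependency order and resuming after a failure. It is not Condor-G's own job format or scheduler; the function name `run_dag`, the step names and the `done` set are hypothetical and used only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(steps, deps, done=None):
    """Run steps in dependency order, skipping steps that already finished.

    steps: dict mapping a step name to a zero-argument callable
    deps:  dict mapping a step name to the set of steps it depends on
    done:  set of step names completed in an earlier attempt (updated in place)
    """
    done = set() if done is None else done
    graph = {name: deps.get(name, set()) for name in steps}
    for name in TopologicalSorter(graph).static_order():
        if name in done:
            continue          # finished before the failure, do not repeat it
        steps[name]()         # an exception here leaves `done` as the restart point
        done.add(name)
    return done


# Hypothetical three-step job: fetch data, process it, publish the result.
if __name__ == "__main__":
    steps = {
        "fetch": lambda: print("fetching data"),
        "process": lambda: print("processing data"),
        "publish": lambda: print("publishing result"),
    }
    deps = {"process": {"fetch"}, "publish": {"process"}}
    run_dag(steps, deps)
```

If a step raises an exception, calling `run_dag` again with the same `done` set picks up at the failed step instead of repeating the earlier ones.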
ESG users access a Web portal to search a directory, and can specify what timeframes and data slices they need. The ESG schedules the data extract and sends it to the requestor; it also provides a list of locations where file replicas may be found, allowing users to choose the one they'd like to download.
Implementing Globus data grid services
To join an open grid service, you need to be pretty computer-savvy and patient. It took me the better part of a day to log on to the Globus data grid. My test installation was limited to data grid services (GridFTP, Reliable File Transfer [RFT] and Replica Location Service [RLS]) even though Globus Toolkit version 4.0 (GT4) supports computational services.
Software components required to install a Globus data grid include the Java SDK, Apache Ant, a C compiler, a database, GNU make and GNU tar. A JDBC-compliant database may be needed for some grid Web services. Tomcat can be used as a Web server, or the standalone Web service container included in the Globus Toolkit can be used instead.
Security requirements for data and compute grids are complex for obvious reasons. Globus security infrastructure depends upon public key infrastructure, host and user security certificates, and a certificate authority (CA) to validate them. The best option is to use a currently supported CA where available. If none exists, SimpleCA from the Globus Toolkit can be used for test purposes. Although installing the security infrastructure was complex and time-consuming, the advantages were immediate. Activities that normally took additional logins were authenticated and approved automatically by grid services using security proxies.
With all prerequisites in place, the build of the source code took over three hours. Setting up certificates for two machines and two users, fixing permissions, and configuring other middleware pieces such as proxies and grid-mapfiles took the better part of a day. Afterwards, GridFTP was used to transfer a file from one machine to another. It wasn't until after the transfer completed that I noticed no login had been required: GridFTP and grid security provided automatic authentication. RLS was the last service deployed. The key to a successful RLS implementation is to set up the environment and database linkages correctly. RLS provides the mapping between logical file names and physical file locations. One logical file could potentially have hundreds of physical locations on the grid, and RLS can be used to catalog all of them.
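The logical-to-physical mapping can be pictured with a small in-memory sketch. This is not the RLS API or its database schema; the class `ReplicaCatalog`, its methods, and the host names in the URLs are hypothetical and stand in for whatever the real catalog stores.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy catalog mapping one logical file name to many physical locations."""

    def __init__(self):
        self._replicas = defaultdict(set)  # logical name -> set of physical URLs

    def register(self, logical_name, physical_url):
        """Record that a copy of logical_name exists at physical_url."""
        self._replicas[logical_name].add(physical_url)

    def unregister(self, logical_name, physical_url):
        """Forget one copy; drop the logical name when no copies remain."""
        self._replicas[logical_name].discard(physical_url)
        if not self._replicas[logical_name]:
            del self._replicas[logical_name]

    def lookup(self, logical_name):
        """Return all known physical locations, sorted for stable output."""
        return sorted(self._replicas.get(logical_name, ()))


if __name__ == "__main__":
    catalog = ReplicaCatalog()
    # Hypothetical replicas of one logical file on two sites.
    catalog.register("climate/run42.nc", "gsiftp://site-a.example.org/data/run42.nc")
    catalog.register("climate/run42.nc", "gsiftp://site-b.example.org/store/run42.nc")
    for url in catalog.lookup("climate/run42.nc"):
        print(url)
```

A user or transfer service would look up the logical name and then choose among the returned locations, which is the kind of replica list the ESG portal presents.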
Storage on data grids can be managed in many ways. One popular approach uses storage resource managers (SRMs) to manage files, disks and archives. An SRM can enforce space quotas and other storage constraints. Files can be pinned (reserved) by an SRM. While pinned, a file can't be removed from an SRM's control, but it can be accessed by multiple users. File pins may be released (unpinned) or left to time out, after which the file is no longer guaranteed to be available.
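The pin lifecycle described above can be sketched as follows. This is an illustration only, not an SRM interface: the class `PinnedFileStore`, its methods, and the choice to track one pin per user are assumptions made for the example.

```python
import time


class PinnedFileStore:
    """Toy model of file pins that can be released or left to time out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the timeout can be tested
        self._pins = {}       # file name -> {user: expiry time in seconds}

    def pin(self, name, user, lifetime_s):
        """Pin a file for one user for lifetime_s seconds."""
        self._pins.setdefault(name, {})[user] = self._clock() + lifetime_s

    def unpin(self, name, user):
        """Release one user's pin explicitly."""
        self._pins.get(name, {}).pop(user, None)

    def is_pinned(self, name):
        """True while at least one unexpired pin remains on the file."""
        now = self._clock()
        live = {u: t for u, t in self._pins.get(name, {}).items() if t > now}
        if live:
            self._pins[name] = live
        else:
            self._pins.pop(name, None)
        return bool(live)

    def remove(self, name):
        """Refuse to remove a file that is still pinned."""
        if self.is_pinned(name):
            raise PermissionError(f"{name} is pinned and cannot be removed")
        return True


if __name__ == "__main__":
    store = PinnedFileStore()
    store.pin("run42.nc", "alice", lifetime_s=3600)
    print(store.is_pinned("run42.nc"))
    store.unpin("run42.nc", "alice")
    print(store.is_pinned("run42.nc"))
```

Once every pin on a file has been released or has expired, `remove` succeeds, which matches the point that the file is then no longer guaranteed to be available.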
SRMs also support dynamic space management. A grid user can request, for example, 200GB of space to be reserved by an SRM. As long as space is available this reservation will be held; if additional requests come in, the reservation may be reduced, assuming there aren't already pinned files in the space. An SRM works with DAS, SAN or NAS storage, or anything else that supports file storage. Because of the unique nature of data grids, however, software licensing is an issue (see "Grid software licensing issues," this page).
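One way to picture this kind of reservation is the sketch below. The reclaim policy is an assumption chosen for illustration (take unpinned space from earlier reservations, oldest first); real SRMs may behave differently, and the class `SpaceManager` and its methods are hypothetical.

```python
class SpaceManager:
    """Toy space reservations that shrink only where no pinned files sit."""

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.reserved = {}  # user -> GB currently reserved
        self.pinned = {}    # user -> GB of that reservation holding pinned files

    def free_gb(self):
        return self.capacity_gb - sum(self.reserved.values())

    def pin(self, user, gb):
        """Mark gb of a user's reservation as occupied by pinned files."""
        used = self.pinned.get(user, 0) + gb
        if used > self.reserved.get(user, 0):
            raise ValueError("pinned files would exceed the reservation")
        self.pinned[user] = used

    def reserve(self, user, want_gb):
        """Grant up to want_gb, reclaiming unpinned space from others if short."""
        shortfall = want_gb - self.free_gb()
        for other in list(self.reserved):  # dicts keep insertion order
            if shortfall <= 0:
                break
            if other == user:
                continue
            reclaimable = max(0, self.reserved[other] - self.pinned.get(other, 0))
            take = min(reclaimable, shortfall)
            self.reserved[other] -= take
            shortfall -= take
        granted = max(0, min(want_gb, self.free_gb()))
        self.reserved[user] = self.reserved.get(user, 0) + granted
        return granted


if __name__ == "__main__":
    srm = SpaceManager(capacity_gb=300)
    print(srm.reserve("alice", 200))  # fits, so the full request is granted
    srm.pin("alice", 150)             # 150GB of it now holds pinned files
    print(srm.reserve("bob", 200))    # only alice's unpinned space can be reclaimed
    print(srm.reserved)
```

In this example the second request cannot be met in full, because the pinned portion of the first reservation is protected from reclamation.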
Local grid data and metadata can be backed up with any standard backup package; non-local data has to be retrieved to your site before it can be backed up locally. Replication of data and metadata across the grid can also be used for backup purposes.
This was first published in October 2005