According to users, there's one big reason to go with Microsoft's Cluster Services (MSCS): it's free, and it comes with the Windows NT Enterprise Server OS. But in the default version Microsoft ships, the server
Dan MacDonald, systems administrator for Sports Depot Inc., learned about the drawbacks to MSCS the hard way several months ago when the Dell Inc. PowerVault he had been using for storage on his MSCS cluster failed.
For Sports Depot, which runs an online sporting goods store, the downtime that ensued was a disaster.
Step one for Sports Depot was to dump the Dell system. MacDonald said he isn't sure why it crashed, but it may have had to do with firmware upgrades or patch updates to his Windows systems. Either way, Sports Depot replaced it with two CX300 SANs from EMC Corp. and then set about trying to make MSCS a true storage cluster.
Double-Take's GeoCluster software is the only product, according to MacDonald (and Double-Take itself), that is meant to support MSCS rather than replace it completely.
"It sits in the I/O stream and sends copies of data written to the primary storage node over to the second one, as well as monitoring the health of the systems," MacDonald said. "We were worried about it holding up traffic, but so far there hasn't been any performance degradation."
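The behavior MacDonald describes, a layer in the write path that copies each write to a second storage node and keeps an eye on that node's health, can be illustrated with a small sketch. This is not Double-Take's implementation; the class names, the in-memory "storage," and the queueing behavior when the secondary is down are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    """Hypothetical stand-in for one storage node; blocks live in a dict."""
    name: str
    healthy: bool = True
    blocks: dict = field(default_factory=dict)

    def write(self, offset: int, data: bytes) -> None:
        if not self.healthy:
            raise IOError(f"{self.name} is unavailable")
        self.blocks[offset] = data


class MirroringFilter:
    """Hypothetical layer in the write path that copies writes to a second node."""

    def __init__(self, primary: StorageNode, secondary: StorageNode) -> None:
        self.primary = primary
        self.secondary = secondary
        self.pending = deque()  # writes not yet delivered to the secondary

    def write(self, offset: int, data: bytes) -> None:
        self.primary.write(offset, data)      # the primary write happens first
        self.pending.append((offset, data))   # then the copy is queued
        self.flush()

    def flush(self) -> None:
        """Deliver queued writes in order; leave them queued if the secondary is down."""
        while self.pending and self.secondary.healthy:
            offset, data = self.pending[0]
            self.secondary.write(offset, data)
            self.pending.popleft()            # drop only after the copy succeeded

    def in_sync(self) -> bool:
        return not self.pending and self.primary.blocks == self.secondary.blocks


primary = StorageNode("primary")
secondary = StorageNode("secondary")
mirror = MirroringFilter(primary, secondary)
mirror.write(0, b"order-1001")
```

In this sketch a write issued while the secondary is marked unhealthy stays in the queue, and a later call to `flush()` brings the two nodes back in sync. A real product would also have to deal with write ordering across volumes, network latency and persistence of the queue, none of which is modeled here.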
GeoCluster also allows Sports Depot to separate the cluster nodes over a WAN link, another feature that Microsoft's default version did not allow when MacDonald bought it. A spokeswoman for Microsoft said the company added the ability to geographically separate cluster nodes via WAN link in newer editions as of last month. Sports Depot, however, keeps its two storage nodes, each connected to two clustered Windows servers, in the same data center, inside a fortified building.
"Basically a nuclear bomb would have to fall on our building," MacDonald said. "Hurricane Juan, one of the largest hurricanes on record in Canada, made landfall right over our data center and we kept running."
MacDonald said the GeoCluster software has worked well so far, but he would like Double-Take to change the way it handles failover. When one cluster node's server fails and a server on the other node picks up where it left off, he said, the failover server can't attach to the primary node's storage.
"It won't cross that way -- so when the failover stops, you have to update both the server and the storage," he said. "But so far I haven't seen any affordable solution to that problem."
Double-Take brushes up GeoCluster
Double-Take announced version 4.4.2 of its high-availability software and new services for MSCS users earlier this week. The newer version, also known as Service Pack 2, includes enhanced quorum arbitration functionality, a streamlined cluster setup process and improved orphaned file management.
Quorum arbitration is a process in which, in the event of a node failure, a "witness node" assesses whether it's the secondary node or the network itself that has caused the problem. Service Pack 2, according to Bob Roudebush, Double-Take's director of solutions engineering, will automatically configure the witness node rather than requiring the user to do it manually.
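The decision a witness helps make can be sketched as a small truth table. The sketch below is a generic illustration of witness-based arbitration under simplifying assumptions, not GeoCluster's actual logic; the function name, its three boolean inputs and the verdict labels are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    HEALTHY = "peer reachable: nothing to arbitrate"
    PEER_FAILED = "peer is down: safe to take over"
    LINK_FAILED = "peer is alive but the link is down: do not take over"
    ISOLATED = "this node is cut off: stand down"


def arbitrate(peer_reachable: bool,
              witness_reachable: bool,
              witness_sees_peer: bool) -> Verdict:
    """Decide, from one node's point of view, what a lost heartbeat means.

    Assumes a single third-party witness that can report whether it
    still sees the peer node.
    """
    if peer_reachable:
        return Verdict.HEALTHY
    if not witness_reachable:
        # With neither the peer nor the witness in view, the fault is
        # most likely on this node's side, so it must not claim ownership.
        return Verdict.ISOLATED
    if witness_sees_peer:
        # The peer is still running; only the path between the nodes broke.
        return Verdict.LINK_FAILED
    return Verdict.PEER_FAILED


verdict = arbitrate(peer_reachable=False,
                    witness_reachable=True,
                    witness_sees_peer=False)
```

The point of the third party is the `ISOLATED` and `LINK_FAILED` cases: without a witness, both nodes could conclude the other had died and try to take ownership of the same data at once.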
According to Roudebush, the install process now also includes automatic detection of the elements of the cluster, and a default setup and configuration wizard. Improvements in orphan management are similar to improvements made to Double-Take's self-titled flagship replication software -- a more efficient means by which the failover software can establish which files in a transaction log have been written to a database following a failure.
Double-Take is also offering installation services and a three-day training program that provides hands-on experience in configuring and using Double-Take and GeoCluster for MSCS users.
MacDonald said he hadn't attended the training class but appreciated Double-Take's on-site help setting up GeoCluster. Double-Take trained his staff during the installation by setting up one server as a demonstration, "and then two of my guys set up the rest with Double-Take's supervision," MacDonald said. "It was the ideal installation and training combo."