Deep dive into SharePoint data recovery: HOT SPOTS

Microsoft's popular collaboration application presents unique backup/recovery challenges, especially when it comes to protecting the data in a way that permits granular recovery.

This article can also be found in the Premium Editorial Download: Storage magazine: What's in store for storage technology in 2009?


Microsoft Office SharePoint Server is becoming a popular enterprise application as companies seek to enhance collaboration across the enterprise. But increased dependence on the platform makes it crucial to establish and maintain business continuity and disaster recovery (DR) strategies.

ESG research finds that 36% of SharePoint users believe their backup processes don't provide an adequate level of protection. And 36% say SharePoint downtime of 60 minutes or less would have an adverse effect on their business. SharePoint (a collective reference to Microsoft Office SharePoint Server, SharePoint Portal Server and Windows SharePoint Services) is a Web-enabled collaboration platform and central repository for unstructured and semi-structured content. It enables file sharing, document workflow and publishing automation, document management with version control, and search and access control for distributed users. The amount of data stored in SharePoint can be staggering. Native backup and recovery tools are inadequate, and item-level recovery can be a challenge.



SharePoint capacity and complexity
SharePoint usage can lead to an increase in primary and secondary storage capacity. Documents stored in shared folders on file servers will be replaced with managed document repositories. And data previously stored only on desktops and laptops may now be centralized in SharePoint. This is especially true if companies adopt SharePoint as an aggregation point for email attachments. Finally, the versioning feature in SharePoint causes multiple copies of data to be stored. ESG research found that, in terms of annual storage capacity growth, 37% of those surveyed experience a 10% to 30% increase, and one-third have an increase of 30% or more. This also impacts the size of future backups, as 68% of SharePoint research respondents note increases in this area.
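To put those growth bands in perspective, here is a minimal sketch of how annual growth compounds. The 500 GB starting size and the three-year horizon are illustrative assumptions, not ESG figures; only the 10% and 30% rates come from the survey band cited above.

```python
from typing import List


def project_capacity(start_gb: float, annual_growth: float, years: int) -> List[float]:
    """Return the projected capacity at the end of each year, compounding annually."""
    sizes = []
    size = start_gb
    for _ in range(years):
        size *= 1 + annual_growth  # each year grows on top of the previous year's total
        sizes.append(round(size, 1))
    return sizes


# Hypothetical 500 GB SharePoint footprint, projected at both ends of the 10%-30% band.
low_band = project_capacity(500, 0.10, 3)
high_band = project_capacity(500, 0.30, 3)
print("10% growth:", low_band)
print("30% growth:", high_band)
```

At the upper end of the band, the footprint more than doubles in three years, and every full backup grows along with it.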

In its simplest form, SharePoint can be configured for all services to run on a single server. A popular SharePoint deployment might consist of a small three-tier server farm with a data tier consisting of back-end SQL Server databases where all of the content is stored, a Web tier to deliver content to users, and an app tier hosting background services and apps. Additionally, data related to the farm and its components resides in the configuration database (one per farm). Depending on factors such as implementation size and availability requirements, the deployment scenario selected could translate to a distributed farm configuration.

There are multiple components to each tier that need to be protected. Although the bulk of SharePoint information resides in a SQL database, there are additional files that should be backed up to fully protect a SharePoint environment, including Internet Information Services (IIS) metadata, front-end data, search indexes and customizations. It's best to protect all of these components in a federated way.


Backup/Recovery
When it comes to safeguarding SharePoint, protection occurs on two levels. DR strategies ensure the entire SharePoint environment can be recovered if something happens to the primary site, while operational recovery strategies provide recovery of a component of a SharePoint site.

SharePoint has two features to guard against accidental deletion of data: version history and a recycle bin. SharePoint's recycle bin has an additional safety net in the form of an administrator-accessible Site Collection Recycle Bin. Unfortunately, versioning and the recycle bin don't address other types of data loss, such as errors or corruption. Also, if an entire document library is deleted, the entire library must be restored. These are two big themes with SharePoint backup and recovery: SharePoint is an application that's challenging to back up and, even when that's successful, recovery of any site data at a granular level is an ordeal. Here's an overview of ways to protect SharePoint content.



Native tools: Native SharePoint backup/recovery tools exist, but they have shortcomings. There's a command-line tool for backup (stsadm.exe) that performs a site-level, full-fidelity backup (not to be confused with the smigrate.exe migration utility, which can make a copy of a site but doesn't guarantee that all customizations and settings are preserved). Command-line utilities, unless part of a batch job, can be frustrating to administer because they have to be run from the local server and are prone to human error. Creating scripts to kick off the backup process and using the Windows Task Scheduler to run the batch job on a regular schedule can overcome these challenges, but there are a few more issues to deal with. First, some data can't be protected with the command-line utility, including the IIS metabase. Second, indexing is suspended while the command-line backup executes, which means that anything added to SharePoint during this time isn't available for search until the job is complete. Finally, recovery is all-or-nothing. If a single item needs to be recovered, the entire site must be restored to an alternate system and the single document then reintroduced to SharePoint.
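A scripted backup of the kind described above might look like the following sketch. It wraps the stsadm.exe site-collection backup operation (-o backup -url ... -filename ...) in a small Python script that a scheduled task could call. The install path, site URL, backup folder and file-naming scheme are all assumptions for illustration and would need to match the actual server.

```python
import subprocess
from datetime import date
from pathlib import PureWindowsPath

# Assumed default location of stsadm.exe on a SharePoint 2007-era server; verify locally.
STSADM = r"C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN\stsadm.exe"


def build_backup_command(site_url, backup_dir, today=None):
    """Build the stsadm argument list for a site-collection backup to a dated file."""
    today = today or date.today()
    # Hypothetical naming scheme: one file per day, e.g. site-20081201.bak
    filename = PureWindowsPath(backup_dir) / f"site-{today:%Y%m%d}.bak"
    return [STSADM, "-o", "backup", "-url", site_url,
            "-filename", str(filename), "-overwrite"]


def run_backup(site_url, backup_dir):
    """Run the backup and report success; must execute on the SharePoint server itself."""
    result = subprocess.run(build_backup_command(site_url, backup_dir),
                            capture_output=True, text=True)
    if result.returncode != 0:
        print("Backup failed:", result.stdout, result.stderr)
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical site URL and backup folder.
    run_backup("http://intranet/sites/hr", r"D:\Backups")
```

Scheduling the script removes the manual step, but it doesn't change the limitations listed above: the IIS metabase is still unprotected and recovery is still at the site level.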

SharePoint offers another native tool for backup and restore, the Central Administration user interface. On the plus side, this approach lets a backup be performed at any level from the whole farm down to an individual content database (with recovery offered at the same level as the backup) and provides a choice of full or differential schemes. The downside is that scheduling backup jobs isn't possible and retention management is a manual process.
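Because retention is left to the administrator, some shops script the clean-up themselves. The sketch below shows one way to do that, assuming a hypothetical layout in which each backup job writes to its own sub-folder under a backup root; the folder layout, the 30-day window and the function names are illustrative, not part of SharePoint.

```python
import shutil
from datetime import datetime, timedelta
from pathlib import Path
from typing import Iterable, List, Tuple


def select_expired(entries: Iterable[Tuple[Path, datetime]],
                   keep_days: int, now: datetime) -> List[Path]:
    """Return the paths whose last-modified time is older than the retention window."""
    cutoff = now - timedelta(days=keep_days)
    return [path for path, modified in entries if modified < cutoff]


def prune_backup_root(backup_root: Path, keep_days: int, dry_run: bool = True) -> List[Path]:
    """List (and optionally delete) expired backup folders under backup_root."""
    entries = [(p, datetime.fromtimestamp(p.stat().st_mtime))
               for p in backup_root.iterdir() if p.is_dir()]
    expired = select_expired(entries, keep_days, datetime.now())
    if not dry_run:
        for folder in expired:
            shutil.rmtree(folder)  # permanent deletion; keep dry_run=True until verified
    return expired


if __name__ == "__main__":
    # Hypothetical backup share and 30-day retention window; dry run only reports.
    for folder in prune_backup_root(Path(r"D:\Backups\Farm"), keep_days=30):
        print("Expired:", folder)
```

A real policy needs one more rule that this sketch omits: a differential backup is only usable together with the full backup it was taken against, so that full backup has to be retained for as long as any of its differentials are.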

SQL Server backup: Another popular approach to protecting SharePoint content is to simply perform SQL Server backups. Unfortunately, this approach leaves some SharePoint components unprotected, including IIS metadata, Web front-end data and search indexes. A SQL backup ensures a full-fidelity restore of the database and its content, but it needs to be complemented with file system backups to recover the whole farm in a DR situation. It's important to note that a SQL backup doesn't help with restoring individual items. You'll have to restore the SQL database under a different name, attach it to another farm and then restore the items to production. This approach may only be suitable for larger sites that have SQL Server tools and a database administrator.
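The backup and restore-under-a-different-name steps map onto standard T-SQL BACKUP DATABASE and RESTORE DATABASE ... WITH MOVE statements. The following sketch only generates the statements so a database administrator can review them before running them. The database name, logical file names and paths in the usage section are hypothetical; the real logical names have to be read from the backup or the source database.

```python
def quote_ident(name: str) -> str:
    """Bracket-quote a SQL Server identifier, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_str(value: str) -> str:
    """Quote a Unicode string literal, doubling embedded single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def backup_statement(database: str, backup_file: str) -> str:
    """Full backup of one content database to a disk file (INIT overwrites the file)."""
    return (f"BACKUP DATABASE {quote_ident(database)} "
            f"TO DISK = {quote_str(backup_file)} WITH INIT;")


def restore_as_copy_statement(backup_file: str, copy_name: str,
                              data_logical: str, data_path: str,
                              log_logical: str, log_path: str) -> str:
    """Restore the backup under a different name, relocating data and log files."""
    return (f"RESTORE DATABASE {quote_ident(copy_name)} "
            f"FROM DISK = {quote_str(backup_file)} "
            f"WITH MOVE {quote_str(data_logical)} TO {quote_str(data_path)}, "
            f"MOVE {quote_str(log_logical)} TO {quote_str(log_path)};")


if __name__ == "__main__":
    # All names and paths below are placeholders for illustration.
    print(backup_statement("WSS_Content", r"D:\Backups\WSS_Content.bak"))
    print(restore_as_copy_statement(
        r"D:\Backups\WSS_Content.bak", "WSS_Content_Recovery",
        "WSS_Content", r"E:\Data\WSS_Content_Recovery.mdf",
        "WSS_Content_log", r"E:\Logs\WSS_Content_Recovery.ldf"))
```

The restored copy still has to be attached to a separate farm before items can be extracted, which is the two-step process that makes item-level recovery from SQL backups so laborious.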

Commercial solutions: Third-party backup and recovery software is another way to go, especially to address the limitations in the native utilities and SQL backup method. Many third-party backup vendors offer some capability for protecting SharePoint, either through a dedicated module for SharePoint backup or a SharePoint app-specific agent that works in conjunction with the backup engine. A backup app can handle the database and system-level protection required of the various SharePoint components. These solutions will deliver automation and customization of backups with features like full, incremental and differential strategies; compression; encryption; direct and indirect restore; retention policies; and expired data clean-up. And many of the backup platforms offer capabilities for creating offsite copies through replication or tape creation.

The most glaring omission for many backup vendors' solutions is granular recovery. To date, only a handful of products offer item-level recovery that doesn't involve a two-step recovery process. These solutions come from a number of vendors, including AvePoint, CommVault, IBM, NetApp and Symantec. Others may redirect the recovery and create a duplicate copy, which will require more storage capacity and necessitate the manual movement of content between duplicate and original sites.

SharePoint usage will likely continue to grow. As the amount and value of data stored on this platform rises, implications for backup and recovery become critical. Taking recovery time and recovery point objectives for SharePoint data into account, it's likely that the top criteria for selecting SharePoint backup will be full fidelity protection, automation, and the ability to recover at site, sub-site and item levels.

This was first published in December 2008
