Mopic - Fotolia
The Tahoe Least-Authority File System, or Tahoe-LAFS, is an open source cloud storage option designed to address common security and reliability concerns with storing data in public clouds. Just as a RAID array stripes data across multiple disks, Tahoe-LAFS stripes data across multiple cloud storage providers. Security is improved because individual cloud storage providers store only data fragments. Tahoe-LAFS also enhances reliability because data is stored with sufficient redundancy to guard against the failure of one or more providers.
Data storage redundancy is achieved through a technique known as erasure coding. Erasure coding is based around the idea that it is possible to specify the total number of drives (or, in this case, cloud providers) that can fail without impacting the functionality of the file system.
Erasure coding uses the variables K and N. K refers to the number of providers required to be functional at any given time, while N is the total number of providers used. Hence, recovery goals can be expressed as K of N. Put into practice, each of your N cloud providers will store a volume of data that is equal to the total size of your data set divided by K.
To further illustrate this concept, let's examine the default Tahoe parameters in which K=3 and N=10. These values, which can be changed, specify that 10 different cloud service providers are being used, and that up to seven of them can fail at any given time. Conversely, three providers must remain online for the file system to remain functional.
Now suppose you needed to store 1 TB (1,024 GB) of data in the cloud (using the default Tahoe-LAFS parameters). Each of the 10 cloud providers will need to store enough data to insulate against the failure of any seven servers. The volume of data that must be stored on each server is the total size of the data set (1,024 GB) divided by K (3). In this case, that would mean that each of the 10 cloud providers would have to store approximately 341.3 GB of data.
It is important to consider what this level of reliability does to your storage costs. Cloud storage providers charge based on the volume of data being stored (some also charge for input/output). Using the example above, the redundancy requirements would triple the total volume of data being stored in the cloud (3,413 GB spread across 10 providers instead of 1,024 GB stored on a single provider).
Varying approaches to open source clouds
Open source cloud options expand
Dig Deeper on Public cloud storage
Related Q&A from Brien Posey
Are you ready for your next data migration project? Take the Myspace migration failure as motivation to implement data source and backup best ... Continue Reading
IT shouldn't overlook the benefits of on-premises desktop hosting, such as more choices for virtualization software. Cloud-hosted desktops also fall ... Continue Reading
The threat of ransomware has evolved over the past few years, causing many organizations to re-examine how they plan to recover data in the event of ... Continue Reading
Have a question for an expert?
Please add a title for your question
Get answers from a TechTarget expert on whatever's puzzling you.