The Tahoe Least-Authority File System, or Tahoe-LAFS, is an open source cloud storage option designed to address common security and reliability concerns with storing data in public clouds. Just as a RAID array stripes data across multiple disks, Tahoe-LAFS stripes data across multiple cloud storage providers. Security is improved because individual cloud storage providers store only data fragments. Tahoe-LAFS also enhances reliability because data is stored with sufficient redundancy to guard against the failure of one or more providers.
Data storage redundancy is achieved through a technique known as erasure coding. Erasure coding lets you specify how many drives (or, in this case, cloud providers) can fail without affecting the functionality of the file system.
Erasure coding uses the variables K and N. K refers to the number of providers required to be functional at any given time, while N is the total number of providers used. Hence, recovery goals can be expressed as K of N. Put into practice, each of your N cloud providers will store a volume of data that is equal to the total size of your data set divided by K.
To further illustrate this concept, let's examine the default Tahoe parameters in which K=3 and N=10. These values, which can be changed, specify that 10 different cloud service providers are being used, and that up to seven of them can fail at any given time. Conversely, three providers must remain online for the file system to remain functional.
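The K-of-N rule can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not Tahoe-LAFS code; the function names `tolerated_failures` and `is_recoverable` are hypothetical.

```python
def tolerated_failures(k: int, n: int) -> int:
    """Number of providers that can fail while data stays recoverable."""
    if not 1 <= k <= n:
        raise ValueError("K must be between 1 and N")
    return n - k


def is_recoverable(providers_online: int, k: int) -> bool:
    """Data is recoverable as long as at least K providers are online."""
    return providers_online >= k


# Default parameters from the article: K=3, N=10
print(tolerated_failures(3, 10))  # 7
print(is_recoverable(3, 3))       # True: exactly K providers remain
print(is_recoverable(2, 3))       # False: fewer than K providers remain
```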
Now suppose you needed to store 1 TB (1,024 GB) of data in the cloud using the default Tahoe-LAFS parameters. Each of the 10 cloud providers will need to store enough data to insulate against the failure of any seven providers. The volume of data that must be stored with each provider is the total size of the data set (1,024 GB) divided by K (3). In this case, each of the 10 cloud providers would have to store approximately 341.3 GB of data.
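As a quick check of that figure, here is a minimal sketch of the per-provider calculation. The helper name `per_provider_gb` is hypothetical, and the calculation ignores any encoding or metadata overhead a real deployment might add.

```python
def per_provider_gb(dataset_gb: float, k: int) -> float:
    """Data each provider stores: the data set size divided by K."""
    if k < 1:
        raise ValueError("K must be at least 1")
    return dataset_gb / k


# 1 TB (1,024 GB) with the default K=3
print(round(per_provider_gb(1024, 3), 1))  # 341.3
```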
It is important to consider what this level of reliability does to your storage costs. Cloud storage providers charge based on the volume of data being stored (some also charge for input/output). Using the example above, the redundancy requirements would more than triple the total volume of data being stored in the cloud: roughly 3,413 GB spread across 10 providers instead of 1,024 GB stored on a single provider, an increase by a factor of N divided by K (10/3, or about 3.3).
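To estimate the effect on your own bill, the same arithmetic can be applied to other values of K and N. The sketch below assumes storage volume is the only cost driver and uses hypothetical helper names (`total_stored_gb`, `expansion_factor`); it does not account for input/output charges or provider-specific pricing.

```python
def total_stored_gb(dataset_gb: float, k: int, n: int) -> float:
    """Total volume across all N providers, each holding dataset/K."""
    if not 1 <= k <= n:
        raise ValueError("K must be between 1 and N")
    return dataset_gb * n / k


def expansion_factor(k: int, n: int) -> float:
    """How many times larger the stored volume is than the original data."""
    if not 1 <= k <= n:
        raise ValueError("K must be between 1 and N")
    return n / k


# Default parameters (K=3, N=10) applied to 1,024 GB
print(round(total_stored_gb(1024, 3, 10)))  # 3413
print(round(expansion_factor(3, 10), 2))    # 3.33
```

Raising K relative to N lowers the expansion factor, at the price of tolerating fewer provider failures.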