The number of scale-out storage systems available from vendors has increased in the last two years, with offerings for block storage and NAS systems, as well as a few that support both access methods simultaneously (generally called unified storage). Like any other storage system, a scale-out system is usually purchased through a trusted reseller who provides information about its capabilities and performance characteristics. However, scale-out storage systems differ in a few ways from the more traditional dual-controller storage systems typically deployed in small and medium-sized business environments.
If you are new to scale-out architecture, here's a basic overview:
- A node is the basic unit of scaling in a scale-out system. Scaling is accomplished by adding nodes that contain the processing element running the embedded storage software, some number of interfaces and storage devices. A node may have one or two controllers. If the scale-out system evolved from a dual-controller storage system where high availability is achieved by one controller taking over for the one that may have failed, it will likely have dual-controller nodes. If the scale-out system was designed as scale-out, a node will likely have a single controller. In this case, if one node fails, high availability is achieved by the workload and data access being taken over by other nodes in the scale-out system. This is usually referred to as an N+1 architecture.
- Scale-out storage systems can vary in the way they ensure data availability. Some make replica copies of data on other nodes (usually two) and then manage the replication to keep the copies coherent. Others give nodes shared access to the storage devices, so hosts can reach the data through a surviving node if one node fails. This difference affects the usable capacity of a system, which shows up in the price for a given amount of capacity. As with any storage system, data reduction (deduplication or compression) can make a dramatic difference in the cost of a scale-out system.
- Scale-out systems also vary in how nodes are added and how data is rebalanced. Some systems allow another node to be connected and then automatically redistribute data to balance operations. The same mechanism may be used when a failed node is replaced or when a node is being upgraded. More advanced systems use a separate link between the nodes, called a back-channel interconnect, to move the data and to carry other inter-node communications.
- The routing of a request that arrives at one node for data held on another node can also differ considerably. Some systems transfer the data from the node that owns it (where it is stored) to the node that received the access request. Others route the input/output request to the owning node over the network using some form of address redirection (there are several methods, depending on the type of storage network).
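To make the usable-capacity point from the availability bullet concrete, here is a minimal Python sketch of the arithmetic. The function names, the `total_copies` parameter and the sample prices are illustrative assumptions, not figures from any vendor. It also simplifies the shared-access case to a single copy and ignores RAID or erasure-coding overhead.

```python
def usable_capacity_tb(raw_tb: float, total_copies: int = 1) -> float:
    """Usable capacity when every block is stored `total_copies` times.

    total_copies=1 models a shared-access design (assumption: no
    RAID or erasure-coding overhead); total_copies=3 models an
    original plus two replicas on other nodes.
    """
    if total_copies < 1:
        raise ValueError("total_copies must be at least 1")
    return raw_tb / total_copies


def cost_per_effective_tb(price: float, raw_tb: float,
                          total_copies: int = 1,
                          reduction_ratio: float = 1.0) -> float:
    """Price divided by effective capacity.

    reduction_ratio is the assumed data reduction factor from
    deduplication or compression (2.0 means 2:1).
    """
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    effective_tb = usable_capacity_tb(raw_tb, total_copies) * reduction_ratio
    return price / effective_tb


if __name__ == "__main__":
    # Hypothetical system: 300 TB raw for 150,000 (any currency).
    print(usable_capacity_tb(300, total_copies=3))
    print(cost_per_effective_tb(150_000, 300, total_copies=3))
    print(cost_per_effective_tb(150_000, 300, total_copies=3,
                                reduction_ratio=2.0))
```

Comparing the last two calls shows why the data reduction ratio matters as much as the replication scheme when comparing quoted prices.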
Your vendor will have guidelines about performance on a per-node basis and the effectiveness of scaling with multiple nodes. These guidelines are the best starting point for sizing the system, that is, for deciding how many nodes are needed in the configuration. As demand for capacity increases, nodes with the required capacity should be added so that the ratio of capacity to performance is maintained. Adding capacity without adding matching performance has been a recurring source of performance problems in IT.
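As a rough illustration of that sizing step, the sketch below estimates a node count from per-node guidelines. The function and its parameters are hypothetical, and the flat `scaling_efficiency` factor is a simplifying assumption; a vendor's real scaling guidance may not be linear.

```python
import math


def nodes_needed(required_iops: float, required_tb: float,
                 iops_per_node: float, tb_per_node: float,
                 scaling_efficiency: float = 1.0,
                 spare_nodes: int = 0) -> int:
    """Estimate how many nodes satisfy both performance and capacity.

    scaling_efficiency (0 < e <= 1) is an assumed flat factor for how
    much of each node's rated performance survives in a cluster.
    spare_nodes can add headroom, for example one node for N+1.
    """
    if not 0 < scaling_efficiency <= 1:
        raise ValueError("scaling_efficiency must be in (0, 1]")
    for_performance = math.ceil(
        required_iops / (iops_per_node * scaling_efficiency))
    for_capacity = math.ceil(required_tb / tb_per_node)
    # The larger requirement drives the configuration.
    return max(for_performance, for_capacity) + spare_nodes


if __name__ == "__main__":
    # Hypothetical workload: 180,000 IOPS and 400 TB usable.
    print(nodes_needed(180_000, 400, iops_per_node=50_000,
                       tb_per_node=100, scaling_efficiency=0.8))
```

Taking the maximum of the two counts reflects the advice above: when capacity grows, the node count should grow with it so the capacity-to-performance ratio stays where it was sized.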