I keep hearing that the "speed of light" is the limiting factor for performance in long-distance SANs. If that is so, why is it not an issue for DWDM systems that span 600km while transporting all types of data traffic?
I suspect it is important only for synchronous data replication, but can you advise, expound and educate?
When you configure your applications to run in a DR solution where every write is transferred to the remote site, your two enemies are bandwidth and distance. During normal operation, read requests from the host add no overhead to the replication solution, since they are all satisfied by the local storage subsystem; read performance will be similar to a normal non-DR solution. During an application write, however, the data needs to be replicated to the storage array at the remote site, and that replication process can hurt application performance, especially with synchronous replication.
You see, during "sync" data replication, EVERY write to the local storage subsystem must be transmitted over the Fibre Channel link to the storage subsystem at the remote site BEFORE an "I/O Complete" message is returned to the host application. The host application therefore has to wait while the data is written to the remote site and an acknowledgement comes back confirming it has been accepted.
Because Fibre Channel uses the SCSI protocol to "talk" to the underlying disk drives, it takes two round trips over the wire for the data to be accepted. The protocol goes something like this for a write:
1. Ask the remote site for a transfer-ready (XFER_RDY) to receive the data.
2. Wait for the response.
3. Send the data.
4. Wait for acknowledgment from the remote site.
That's two full round trips — four one-way traversals of the optical link — for EVERY write!
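The cost of those four link traversals can be sketched in a few lines of Python. This is a toy model, not the FCP specification: it assumes a fixed one-way delay and ignores serialization, switching and array service times, and the variable names are purely illustrative.

```python
# Toy model of the per-write handshake cost over the replication link.
# Assumes a fixed one-way propagation delay `d_ms` (milliseconds) and
# ignores serialization, switching and array service times.

def sync_write_penalty(d_ms: float) -> float:
    """Latency added by the four one-way traversals listed above."""
    xfer_rdy_request  = d_ms  # 1. ask the remote site to receive data
    xfer_rdy_response = d_ms  # 2. wait for the response
    data_out          = d_ms  # 3. send the data
    write_ack         = d_ms  # 4. wait for the acknowledgment
    return xfer_rdy_request + xfer_rdy_response + data_out + write_ack

# A 0.5 ms one-way delay (roughly 100 km of fibre) costs 2 ms per write.
print(sync_write_penalty(0.5))
```

Note that the penalty scales linearly with distance: double the link length and every single write pays double the handshake cost.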
Now let's look at the physics of Fibre Channel optical transmissions:
Light travels at about 300,000KM per second in a vacuum. Inside the glass of an optical fibre, the refractive index slows it to about 200,000KM per second. That gives us a one-way "latency" of about 1 millisecond for every 200KM.
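That figure is easy to verify with a quick back-of-the-envelope calculation (illustrative Python, using the speeds quoted above):

```python
# Propagation latency in optical fibre, using the speeds quoted above.
FIBER_KM_PER_S = 200_000  # light in glass, about 2/3 of vacuum speed

def one_way_latency_ms(distance_km: float) -> float:
    """One-way propagation delay over `distance_km` of fibre."""
    return distance_km / FIBER_KM_PER_S * 1000.0

print(one_way_latency_ms(200))  # about 1 ms one way per 200 km
```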
My experience has shown, given all of the above (round trips, protocol overhead, noise, retransmits), that you end up with around a millisecond of latency for every 25 miles. So every 25 miles of distance using SCSI over Fibre Channel adds around 1 millisecond to each write.
If you look at a "best case" scenario of only 20 microseconds of latency per kilometer (four traversals at 5 microseconds each), that works out to about 32.2 microseconds per mile. That's still a lot of latency for high-performance applications.
For example, if your remote site is located 100 miles away, each write picks up roughly 4 milliseconds: about 3.2 milliseconds of pure propagation delay, plus protocol overhead. A normal write to local cache, by comparison, completes in the microsecond range.
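Both figures for the 100-mile example can be checked with a few lines (illustrative Python; the constants are the ones quoted above, not measured values):

```python
# The 100-mile example, using the two figures quoted above.
BEST_CASE_US_PER_MILE = 32.2          # best-case round-trip latency
RULE_OF_THUMB_MS_PER_25_MILES = 1.0   # observed, incl. protocol overhead

miles = 100
best_case_ms = miles * BEST_CASE_US_PER_MILE / 1000.0
rule_of_thumb_ms = miles / 25 * RULE_OF_THUMB_MS_PER_25_MILES

print(best_case_ms)      # pure propagation: about 3.2 ms per write
print(rule_of_thumb_ms)  # with real-world overhead: about 4 ms per write
```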
So you are right: SYNC replication drags down performance for any application as distance grows. Since ASYNC data replication returns an immediate "I/O complete" to the application, distance has little effect on response time (although it still determines how far the remote copy can lag behind the local one).
Editor's note: Do you agree with this expert's response? If you have more to share, post it in one of our discussion forums.
This was first published in February 2003