In vSphere 4.0, VMware introduced an introspection and filtering tool attached to the virtual SCSI interface called the vSCSI filter, which acts as a transport layer to send SCSI data to a virtual machine. With the vSCSI filter, that virtual machine can then act on the SCSI data.
The uses for this type of filtering range from data loss prevention, anti-malware and anti-virus to replication techniques for running VMs. Other uses of the vSCSI filter are being developed, such as decryption of disk data.
What makes this filter work is its high performance for reading and writing SCSI blocks, which is the key to virtual disk technology within the vSphere environment. Now, with the vSCSI filter, we can capture those reads and writes to meet other needs. VMware first made use of the vSCSI filter in its vShield Endpoint product, which detects and isolates viruses and malware. VMware soon added a companion product, vShield App with Data Security, which uses the vSCSI filter to look within the virtual disk for sensitive data that should not reside in the target VMs, such as personally identifiable information. This scan runs out-of-band with respect to the guest operating system’s existing virtual disk accesses.
While the first uses of this technology were for routine security chores, more interesting uses have come along. For instance, the vSCSI filter is used in real-time replication as a tap on the SCSI subsystem of a running VM so that changed blocks can be written to an external target such as a cloud-based replication receiver. The first product with this technology was Zerto’s Zerto Virtual Replication, which won TechTarget’s Best of Show award at VMworld 2011 (TechTarget also publishes SearchVirtualStorage.com). VMware implemented the vSCSI filter for real-time replication in its Site Recovery Manager 5 Replication product.
How the vSCSI filter works
The vSCSI filter taps the SCSI blocks as they are read and written by a VM, and it enables you to send your own commands to the SCSI subsystem. In essence, you can read through the entire virtual disk without the VM even being aware of it. Let’s look at how replication with the vSCSI filter would work as an example:
- Replication starts.
- Replication software sends commands down to read the entire virtual disk.
- Replication takes these initial reads of the entire virtual disk and transports them to the replication receiver using deduplication techniques.
- Replication switches to a tap mechanism to record only blocks written by the VM.
This type of replication eliminates much of the costly ongoing bandwidth charges, since only the changed blocks are sent over the wire once the initial copy is complete. In addition, the vSCSI filter is independent of the VM and so doesn’t burden the VM with replication chores.
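The replication flow described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it does not use any VMware API, and the names `VirtualDisk`, `Receiver`, `send_block` and `start_replication` are hypothetical stand-ins invented for the example. It also assumes a single-threaded toy disk, so it ignores writes that arrive during the initial copy, which a real product would have to handle.

```python
import hashlib

BLOCK_SIZE = 4  # bytes per block; tiny so the example is easy to follow


class VirtualDisk:
    """Hypothetical toy disk: a list of fixed-size blocks with write taps."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.write_taps = []  # callbacks invoked after every write

    def read(self, index):
        return self.blocks[index]

    def write(self, index, data):
        self.blocks[index] = data
        for tap in self.write_taps:
            tap(index, data)


class Receiver:
    """Hypothetical replication target that deduplicates by content hash."""

    def __init__(self):
        self.store = {}      # content hash -> block data
        self.block_map = {}  # block index -> content hash
        self.bytes_received = 0

    def has(self, digest):
        return digest in self.store

    def put(self, index, digest, data=None):
        # data is None when the receiver already holds this content
        if data is not None:
            self.store[digest] = data
            self.bytes_received += len(data)
        self.block_map[index] = digest

    def read(self, index):
        return self.store[self.block_map[index]]


def send_block(receiver, index, data):
    """Send the payload only if the receiver has not seen this content."""
    digest = hashlib.sha256(data).hexdigest()
    if receiver.has(digest):
        receiver.put(index, digest)
    else:
        receiver.put(index, digest, data)


def start_replication(disk, receiver):
    # Initial pass: read the entire disk and ship it, deduplicated.
    for index in range(len(disk.blocks)):
        send_block(receiver, index, disk.read(index))
    # Switch to tap mode: from now on only written blocks are sent.
    disk.write_taps.append(lambda i, d: send_block(receiver, i, d))


disk = VirtualDisk(8)
disk.write(0, b"AAAA")            # one non-zero block before replication
receiver = Receiver()
start_replication(disk, receiver)
initial_bytes = receiver.bytes_received
print(initial_bytes)              # 8: block 0 plus one copy of the zero block
disk.write(3, b"BBBB")            # the VM keeps running and changes a block
print(receiver.bytes_received)    # 12: only the changed block was sent
```

In this toy run, the eight-block disk costs only two block payloads during the initial pass because seven blocks hold identical zeros, and the later write adds exactly one more block. The guest-side `write` call knows nothing about replication; the tap does the work, which mirrors the article's point that the VM is not burdened with replication chores.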
These are just some uses of the vSCSI filter technology within vSphere. Other possible uses include out-of-band encryption/decryption of SCSI blocks and perhaps even the destructive overwrite required for compliance with various government regulations.
It’s important to note that vSphere did not blaze the trail in enabling this type of replication and introspection. Before vSphere’s vSCSI filter, Microsoft’s Hyper-V had the ability to plug different storage drivers into the stack at appropriate points so that SCSI blocks had to be written through a vendor-created driver, such as one from Virsto, thereby allowing similar replication and introspection.
But there are significant differences between vSphere’s and Hyper-V’s technologies. With Hyper-V, all the activity happens within the kernel, which has a static number of cores associated with it; with vSphere, on the other hand, the vSCSI filter transfers data to a VM, which can be assigned any number of cores and any amount of memory to do the required work. In that sense, vSphere has a true SCSI offload mechanism.
With the ability to not only tap the SCSI subsystem but also control it, the future is bright for this type of technology, which can possibly be used to address other security and data protection issues, such as disk encryption and secure data erasure on storage arrays. While not everyone needs this level of security, it certainly will help some IT organizations.