As Data Dynamics marks its six-year anniversary, CEO Piyush Mehta admitted he could not have envisioned the customer challenges his company would need to address.
The Teaneck, N.J., vendor focused on file migration when Mehta founded the company in 2012. But the Data Dynamics StorageX software has since added capabilities, such as analytics and native Amazon S3 API support, to help customers better understand and manage their data as they transition to hybrid cloud architectures.
The StorageX software enables users to set policies to move files from one storage system to another, including on-premises or public cloud object storage, and to retrieve and manage their unstructured data. New features on the horizon include container support, NFSv4 namespace capabilities and greater integration with Microsoft's Active Directory, according to Mehta.
Mehta said Data Dynamics StorageX currently has more than 100 customers, including 25 Fortune 100 companies and six of the world's top 12 banks. He estimated more than 90% of customer data is stored on premises.
StorageX actually goes back to 2002, when it was developed by startup NuView for file virtualization. SAN switching vendor Brocade acquired NuView in 2006 and tried to use StorageX for a push into storage software. After Brocade's software strategy failed, it sold the StorageX intellectual property to Data Dynamics, which relaunched the software as a file migration tool in 2013.
In a Q&A, Mehta discussed the latest customer trends, upcoming product capabilities and recent pricing changes for Data Dynamics StorageX software.
How has the primary use case for Data Dynamics StorageX software changed over the one you initially envisioned?
Piyush Mehta: What ended up happening is, year over year, as customers were leveraging StorageX for large-scale migrations, we realized a consistent challenge across environments. Customers lost track of understanding the data, lost track of owners, lost track of impact when they moved the data. And we realized that there's a business opportunity where we could add modules that can help do that.
Think of this as ongoing lifecycle management of unstructured data that just keeps growing at 30%, 40%, 50% year over year. The second aspect to that was helping them move data not just to a NAS tier, but also to object storage, both on- and off-prem.
What are the latest customer trends?
Mehta: One theme that we continue to see is a cloud-first strategy regardless of vertical; every company, every CIO, every head of infrastructure talks about how they can leverage the cloud. The challenge is very few have a clearly defined strategy of what cloud means. And from a storage perspective, the bigger challenge for them is to understand what these legacy workloads are and where they can find workloads that can actually move to the cloud.
For born-in-the-cloud workloads, with applications that were started there, it's an easy starting point. But for the years and years of user and application data that's been gathered and collected, all on-prem, the question becomes: How do I manage that?
The second thing is a reality check that everything's not going to the public cloud, and everything's not going to stay local. There's going to be this hybrid cloud concept where certain data and certain applications will most likely -- at least for the foreseeable future -- reside locally. And then whatever is either not used, untouched, deep archive, those type of things can reside in the cloud.
Are customers more interested in using object storage in public or private clouds?
Mehta: It's a mixture. We do see huge interest in AWS and Glacier as a deep archive or dark data archive tier. At the same time, we see [NetApp's] StorageGrid, [IBM Cloud Object Storage, through its] Cleversafe [acquisition], Scality as something that customers are looking at or have deployed locally to tier large amounts of data -- but, again, data that's not necessarily active.
Do you find that people are more inclined to store inactive data than implement deletion policies?
Mehta: I still haven't seen the customer who says, 'It's OK to delete.' You'll have the one-off exceptions where they may delete, but there's always this propensity to save, rather than delete, because I may need it.
What you end up finding is more and more data being stored -- in which case, why would I keep it on primary NAS? No matter how cheap NAS may be getting, I'd rather put it on a cheaper tier. That's where object conversations are coming in.
Which of the new StorageX capabilities have customers shown the most interest in?
Mehta: We have seen huge interest, adoption and sale of our analytics product. Most customers don't know their data -- type, size, age, who owns it, how often it's being accessed, etc. We've been able to empower them to go in and look at these multi-petabyte environments and understand that. Then, the decision becomes: What subset of this do I want to move to a flash tier? What subset do I want to move to a scale-up, scale-out NAS tier?
Then, there is what we call dark or orphan data, where a company says, 'Anything over 18 months old can sit in the active archive tier' -- and by active, I mean something that's disk-driven, rather than tape-driven. That's where we're seeing object interest come in. First, help me do the analytics to understand it. And then, based on that, set policies, which will then move the data.
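The age-based policy Mehta describes can be sketched in a few lines of Python. This is only an illustration, not StorageX code: the `FileRecord` type, the `select_for_archive` function and the 548-day approximation of 18 months are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class FileRecord(NamedTuple):
    """Hypothetical metadata record produced by an analytics scan."""
    path: str
    size_bytes: int
    last_accessed: datetime


def select_for_archive(records: List[FileRecord],
                       now: datetime,
                       max_age_days: int = 548) -> List[FileRecord]:
    """Return records not accessed within max_age_days.

    548 days is a rough stand-in for 18 months (an assumption).
    """
    cutoff = now - timedelta(days=max_age_days)
    return [r for r in records if r.last_accessed < cutoff]


if __name__ == "__main__":
    now = datetime(2018, 6, 1)
    scan = [
        FileRecord("/share/old_report.doc", 120_000, datetime(2016, 1, 15)),
        FileRecord("/share/current.xlsx", 45_000, datetime(2018, 5, 20)),
    ]
    for rec in select_for_archive(scan, now):
        print("archive candidate:", rec.path)
```

The selection step is kept separate from any data movement, which mirrors the order in the answer above: understand the data first, then let a policy decide what moves.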
Does Data Dynamics offer the ability to discover data?
Mehta: We have an analytics module that leverages what we call UDEs -- universal data engines. In the old world, when we were doing migrations only, they were the ones that were doing the heavy lifting of moving the data. Now, they also have the ability to go collect data. They will go ahead and crawl the file system or file directories and capture metadata information that then is sent back into the StorageX database, which can be shared, as well as exported. We can give you aggregate information, and then you can drill on those dashboards, as needed.
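As a rough picture of what crawling a file system for metadata involves, the following Python sketch walks a directory tree and aggregates the results. It is not how the UDEs are implemented, which the interview does not describe; the function names `crawl_metadata` and `summarize_by_extension` and the choice of fields are assumptions made for illustration.

```python
import os
from collections import defaultdict


def crawl_metadata(root):
    """Walk a directory tree and collect per-file metadata (illustrative only)."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                # Skip files that vanish or cannot be read during the crawl.
                continue
            records.append({
                "path": path,
                "extension": os.path.splitext(name)[1].lower(),
                "size_bytes": st.st_size,
                "modified": st.st_mtime,   # seconds since the epoch
                "accessed": st.st_atime,   # seconds since the epoch
                "owner_uid": st.st_uid,    # numeric owner ID (always 0 on Windows)
            })
    return records


def summarize_by_extension(records):
    """Aggregate file count and total bytes per file extension."""
    totals = defaultdict(lambda: {"files": 0, "bytes": 0})
    for rec in records:
        entry = totals[rec["extension"]]
        entry["files"] += 1
        entry["bytes"] += rec["size_bytes"]
    return dict(totals)


if __name__ == "__main__":
    summary = summarize_by_extension(crawl_metadata("."))
    for ext, stats in sorted(summary.items()):
        print(ext or "(none)", stats["files"], "files,", stats["bytes"], "bytes")
```

Only metadata is read here, never file contents, and the per-extension summary stands in for the kind of aggregate view that could then be drilled into.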
Does your analytics engine work on object- and file-based data?
Mehta: It works only on file today. It's really to understand your SMB and NFS data to help determine how to best leverage it. Most of that data -- I would say north of 95% -- is sitting on some kind of file tier when you look at unstructured data. It's not sitting on object.
Where is StorageX deployed?
Mehta: The StorageX software gets deployed on a server within the customer environment, because that's your control mechanism, along with the back-end databases. That's within the customer's firewalls. From an infrastructure standpoint, everything sits in central VMs [virtual machines]. We're moving it to a container technology in our next release to make it far more flexible and versatile in terms of how you are scaling and managing it.
What other capabilities do you think you'll need moving forward?
Mehta: More integration with Active Directory so that we can provide far more information in terms of security and access than we can today. From a technology standpoint, we are continuing to make sure that we support API integration downstream into local storage vendors -- so, the latest operating systems and the latest APIs. Then, from a hyperscaler standpoint, being able to have native API integration into things like Azure Blob and Glacier are things that are being added.
Data Dynamics updated StorageX pricing this year. There's no longer a fee for file retrieval, but the prices for analytics, replication and other capabilities increased. What drove the changes?
Mehta: The costs haven't gone up. Before, we were giving you a traditional licensing mechanism where you had two line items: a base license cost and a maintenance cost. That was confusing customers, so we decided to just make it one single line item. Every module of ours now becomes an annual subscription based on capacity, where the cost of maintenance is embedded into it.
The other thing we learned from our customers was that when you looked at both an archive and a retrieval capability, we wanted customers to have the flexibility to manage that without budgeting and worrying about the cost constraints of what they were going to retrieve. It's hard to predict what percentage of the data that you archive will need to be brought back. The management of the 'bring back, send back, bring back, send back' becomes a huge tax on the customer.
Now, the functionality of retrieval is given to you as part of your archive module, so you are not paying an incremental cost for it. It became subscription, so it's just an auto renewal, rather than worrying about it from a Capex perspective and renewing maintenance and all of that.