The technical details of how deduplication works are beyond the scope of this post (and, although I have a basic understanding, probably beyond the author as well), but its usefulness and benefits are not. NetApp pioneered the use of storage efficiency techniques such as deduplication, compression, replication and cloning during the 1990s, and these have been very well adopted ever since. Matt Watts, NetApp Director of Strategy and Technology, is fond of saying that the CTOs he meets tell him they chose NetApp for the storage efficiency capabilities we offer but continue choosing NetApp for a host of other value-added benefits. One of the key technologies is deduplication, and whilst the architectural approaches vary (source side, target side, in-line, post-processing and so on), the goal is the same: to store less data.
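For readers who want a feel for the basic idea, here is a minimal sketch in Python. It is a generic illustration of fixed-size block deduplication, not a description of how Data ONTAP implements it: the 4 KB block size, the SHA-256 fingerprint and the function names (`deduplicate`, `restore`, `stored_bytes`) are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def deduplicate(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep each unique block once."""
    store = {}   # fingerprint -> block contents (each unique block stored once)
    recipe = []  # ordered fingerprints needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        store.setdefault(fingerprint, block)  # a repeated block is not stored again
        recipe.append(fingerprint)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the block store and the recipe."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


def stored_bytes(store) -> int:
    """Total bytes physically kept after deduplication."""
    return sum(len(block) for block in store.values())
```

Feeding it data made of three identical blocks plus one different block leaves only two blocks in the store, while `restore` still returns the original bytes. That is the whole goal in miniature: the same data, with less of it stored.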
As NetApp has a single operating system, Data ONTAP, running across all its shared infrastructure products (the 2000, 3000 and 6000 series), we are able to run a common monitoring service and gather a lot of data on how our systems are being used in the field. A feature called ASUP, which stands for “AutoSupport”, tracks usage and sends the stats back to us. It isn’t a Big Brother capability; it allows us to do preventative and interventionist work to the benefit of our customers. The majority of our customers, about 90% in fact, turn on this tracking facility (it needs to be initiated by them). Some industries and government institutions don’t allow it, and that is fine: it is purely voluntary.
What we find from the amassed data is just how much storage space is being saved by our storage efficiency techniques. There’s a real-time monitor on our web site that keeps a live ticker of how much storage space we are saving for those customers who have turned it on. At the moment it shows over 9 Exabytes saved for them in total. One can then extrapolate how much money we have saved the industry: it’s billions of dollars in theory.
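To see how the 9 Exabyte figure turns into “billions”, here is a back-of-the-envelope sketch. The cost per gigabyte is a purely hypothetical placeholder, not a NetApp or market figure, and the calculation uses decimal units (1 Exabyte = 10^9 Gigabytes); plug in whatever cost you think is realistic.

```python
SAVED_EXABYTES = 9          # the figure quoted above
GB_PER_EXABYTE = 10**9      # decimal units: 1 EB = 1,000,000,000 GB
ASSUMED_COST_PER_GB = 1.00  # hypothetical cost in dollars, for illustration only


def estimated_savings_usd(exabytes: float, cost_per_gb: float) -> float:
    """Rough value of storage that did not have to be bought."""
    return exabytes * GB_PER_EXABYTE * cost_per_gb


savings = estimated_savings_usd(SAVED_EXABYTES, ASSUMED_COST_PER_GB)
print(f"${savings / 1e9:.1f} billion")  # prints "$9.0 billion" at the assumed $1/GB
```

Even at a small fraction of that assumed cost, the total stays in the billions, which is the point of the extrapolation.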
That’s great news for those who have switched on the dedupe capabilities within Data ONTAP. However, although these storage technologies have been around for many years, it is surprising that not every customer has switched them on. It is a no-cost exercise: the overhead of deduplication is minimal and the benefits are significant.
I was recently at an IBM N Series customer event. N Series is IBM’s OEM’ed storage platform, which comes originally from NetApp (hence the ‘N’ designation). My co-presenter, Ian Shave, who is the WW storage business line manager, outlined the success of the seven-year relationship in his keynote: 40,000 systems shipped, 500+PB of capacity delivered and one of the fastest growing product lines in the IBM storage portfolio. It has been a very successful relationship, and IBM customers are getting a compelling value proposition from the combination of IBM’s portfolio of server, software and services technologies with NetApp’s market-leading storage solutions. It’s a success story for both companies.
However, one of the stats he also revealed was that across this very significant estate, only 40% of customers have turned deduplication on. Those that have are reaping tremendous efficiencies, but that seemed a low figure to me. He asked the assembled audience of N Series customers which of them were using dedupe and, sure enough, it was approximately 40%, so very consistent.
Despite being more than a decade old, storage efficiency techniques still have a lot to offer. With the inexorable growth of data and shrinking IT budgets, they are clearly ‘low-hanging fruit’ for avoiding the runaway expense of buying more and more storage. As our co-founder, Dave Hitz, is famous for saying: we want you to buy less storage, which is why we have developed the most sophisticated efficiency technologies in the industry. It’s just that when you do buy storage, buy it from us! See post from a couple of years ago, here.
So, whilst deduplication may seem “table stakes” for a storage vendor, there is still a lot to be done in educating the customer base on its utility and ultimately its benefit to the capex budget. Not all dedupe solutions are equal, either. For example, Data ONTAP can deduplicate clones, which is what we call compounded data storage efficiency. No other vendor can do that. As is often the case, the devil is in the detail!