Debunking Dedupe and Databases

When I talk to customers or attend conferences, I spend lots of time on Tegile’s unique implementation of deduplication. In database-centric discussions, I get quite a bit of pushback. Many say “Rubbish! Dedupe doesn’t do squat in a database environment!” While myopically true, here is my rebuttal (that usually wins):

1) Concede that in a single instance production database, dedupe does not get you much capacity savings.

OK, OK – I agree. In a single instance production database, dedupe does not get you much capacity savings. There is not that much redundant data there, and most organizations are willing to spend on top-tier performance for their mission-critical and customer-facing databases and applications. We sell our T3800 into these environments quite well and our customers are very happy with their results. Customers can actually turn off dedupe in our systems on a per-volume basis, but none of them do (that I know of).

Step 1) Concede

2) Open the discussion to test, development and QA instances of the database.

Most organizations I have met with to discuss database optimization have between 4 and 10 instances of their database behind the production curtain. These instances exist only to support the integrity and advancement of the production database. I have not seen many organizations willing to spend the dollars on production-class gear to support these back-office functions. AH-HA! “So, if I were to deliver a storage system that could eliminate data redundancies across these 4-10 instances, yielding a 25% to 90% reduction in capacity requirements, would you be pleased?” I always get a resounding YES to this question.

Step 2) Find redundant instances
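
To make the arithmetic behind that question concrete, here is a rough sketch in Python. It is not how Tegile's dedupe works internally; it simply assumes fixed-size 4 KB blocks fingerprinted with SHA-256, and it fabricates synthetic "database images" in which each back-office copy differs from production in a handful of blocks. The function names, the block size, and the 10% change rate are all made up for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def make_instance(num_blocks, changed_blocks, tag):
    """Build a synthetic database image as a list of blocks.

    Blocks listed in changed_blocks get content unique to this instance
    (derived from tag); all other blocks match the shared baseline.
    """
    blocks = []
    for i in range(num_blocks):
        label = f"{tag}-{i}" if i in changed_blocks else f"base-{i}"
        # Expand a 32-byte digest into one full block, deterministically.
        blocks.append(hashlib.sha256(label.encode()).digest() * (BLOCK_SIZE // 32))
    return blocks


def dedupe_savings(instances):
    """Return the fraction of blocks that need not be stored
    when identical blocks are kept only once across all instances."""
    unique = set()
    total = 0
    for blocks in instances:
        for block in blocks:
            total += 1
            unique.add(hashlib.sha256(block).digest())
    return 1 - len(unique) / total


if __name__ == "__main__":
    # One production image plus four hypothetical back-office copies,
    # each of which has modified 100 of its 1,000 blocks.
    production = make_instance(1000, set(), "prod")
    copies = [make_instance(1000, set(range(100)), t)
              for t in ("test", "dev", "qa", "stage")]
    print(f"Production alone: {dedupe_savings([production]):.0%} saved")
    print(f"All five instances: {dedupe_savings([production] + copies):.0%} saved")
```

In this toy setup the production image on its own saves nothing, which is the point conceded in step 1. Across all five images, 5,000 blocks are written but only 1,400 are distinct (1,000 baseline plus 4 × 100 changed), which works out to roughly 72% saved. Real results depend on how far each copy has drifted from production and on the database's own block layout, so treat the numbers as an illustration rather than a prediction.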

3) Throw some nectar on my dedupe position to make it that much sweeter.

After getting that resounding YES, I ask a follow-up question: “What if this back-office storage system that stores test, development and QA runs on the exact same architecture and microcode as your production system?” This would reduce the risk of any storage-infrastructure-driven oddities that stem from moving from a lower-cost system to a high-end, latency-optimized system. I think you know what the answer is: another resounding “YES.”

Step 3) Pour Nectar

While I know there are many other factors to consider when evaluating storage infrastructure for databases, I encourage you not to be so dismissive about dedupe in your database environment. Want to learn more? Come see us at Oracle Open World’s “Scene and Be Heard Theater” – Moscone South Exhibition Hall – Booth 313 on Tuesday, October 1st at 12:30 – 12:50.
