Compare Data Domain
Traditional VTLs: VTL-centric designs
Tape library protocols predate backup-to-disk systems and are used between physical tape libraries (PTLs) and backup software. Some vendors used these protocols to create disk-based products that could solve some known problems with tape [1]. They did not assume that tape would go away, however; they assumed disk would provide a front-end cache to PTLs. The result: virtual tape libraries (VTLs) used as a cache for backup data on the way to tape.
Data Domain also offers a VTL protocol option, but within a deduplication-centric system design, and the differences from a VTL-centric design are clear.
- VTL-centric designs are built to augment tape, not minimize it. At a very large site, changing quickly can be seen as a risk, even when minimizing tape is a known goal. VTL-centric systems serve this market. The effect is that they are not well positioned for cost-effective nearline retention or replication for disaster recovery (DR). In some cases, they can scale to be enormous, befitting these customers. But if you want to stop using tape for DR, or restore from disk most of the time, and you worry about capital costs, this is not the approach for you.
- VTL-centric designs support backup only. The VTL interface is not supported by many non-backup nearline applications, such as email or file archiving. A VTL-centric design will usually be offered only for backup support, not for other nearline storage applications.
- Bolting on dedupe to a VTL-centric design: the seams show. Building a big, scalable disk cache is very different from building a consolidated dedupe store. One scales by striping data across a lot of drives; the other scales by CPU-centric conservation of disk. One is built for copy to tape; the other is built for inline dedupe and replication to minimize time to DR. You can put both in the same sheet metal, but that doesn't make it a system. The bolt-on vendor has two things to design, maintain, and manage, leading to difficult policy setting, performance peculiarities, resilience design gaps, etc.
In a Data Domain system, deduplication is integrated and fundamental, so any data written from any client across any combination of interfaces (including VTL) is deduplicated against all prior data. The system can support any nearline application class, including backup, recovery, and archiving. Deduplicated replication is integrated and flexible. And it can compete economically with the TCO of tape, because tape minimization is at the heart of its design.
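The idea of one deduplicated store shared by every interface can be illustrated with a toy sketch. This is not Data Domain's implementation: the fixed 4 KB chunk size, the SHA-256 fingerprints, the in-memory dictionaries, and the names `DedupeStore`, `write`, and `read` are all assumptions chosen for brevity. It shows only that data arriving over a hypothetical "vtl" interface and data arriving over a hypothetical "nfs" interface are checked against the same set of previously stored chunks.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, for illustration only


class DedupeStore:
    """Toy content-addressed store shared by every ingest interface."""

    def __init__(self):
        self.chunks = {}        # fingerprint -> chunk bytes, stored once
        self.recipes = {}       # (interface, name) -> ordered fingerprints
        self.logical_bytes = 0  # total bytes written by clients

    def write(self, interface, name, data):
        """Store data from any interface; return the number of new chunks."""
        recipe = []
        new_chunks = 0
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # Only chunks never seen before, from any interface, use disk.
            if fingerprint not in self.chunks:
                self.chunks[fingerprint] = chunk
                new_chunks += 1
            recipe.append(fingerprint)
        self.recipes[(interface, name)] = recipe
        self.logical_bytes += len(data)
        return new_chunks

    def read(self, interface, name):
        """Rebuild the original data from its chunk recipe."""
        recipe = self.recipes[(interface, name)]
        return b"".join(self.chunks[fp] for fp in recipe)

    def physical_bytes(self):
        """Bytes actually held after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())


store = DedupeStore()
backup = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE

# A backup image written over the VTL interface: both chunks are new.
first = store.write("vtl", "monday_full", backup)

# The same data plus one extra chunk, written over a file interface:
# only the extra chunk is new, because the store is shared.
second = store.write("nfs", "archive_copy", backup + b"C" * CHUNK_SIZE)

print(first, second, store.logical_bytes, store.physical_bytes())
```

In this sketch the second write adds a single chunk even though it arrived through a different interface. A bolt-on design that keeps a separate dedupe pool per interface or per volume type would store the shared chunks again.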
How to identify a VTL-centric design:
- VTL-centric designs do not natively replicate deduped data.
- VTL-centric designs will struggle with design tradeoffs that stem from needing to add dedupe to an existing design. It is hard to make the resulting system simple. Look for separate policy management for different types of volumes (redundant versus deduplicated). Also, in some cases, check whether each file type has to be managed separately because the approach to dedupe lacks generality.
- VTL-centric designs typically promote the ability to copy directly to a PTL, using the VTL as the data mover, rather than using the backup software. Tradeoff: losing control of the backup software catalog.
[1] For example, LTO-4 tape drives need a much higher input rate to stream (run at rated speed) than most servers can deliver; by collecting backup images on a disk stage first, the images can be copied to tape much faster, keeping the drive streaming.
