In a modern backup and recovery infrastructure, backup-to-disk solutions are very common. Many customers, however, have to keep an eye on CapEx and OpEx. Modern deduplication technologies can reduce these expenses quite easily. To actually save money, though, you have to follow some rules; otherwise the backup-to-disk solution will not work as efficiently as expected. The most important rules and a comparison of virtual and physical tape drives are listed below. More information and best-practice tips can be found in the “StoreOnce User Guides”.
Basics:
- Physical and virtual tape principles are very different.
- Physical tape receives data that is optimized for a sequential data stream and writes the data in a serial format to a sequential target device.
- Virtual tape receives data in a random stream and, to achieve top performance, writes the data in a random format to a random-access target device.
- Because these are two very different types of devices, the methods you use to optimize them must also be very different to achieve the best performance. The methods used for physical tape will not work for virtual tape.
Optimize for physical tape:
- Multiplexing and high concurrency (the number of clients writing to the same tape at the same time) are the best ways to increase the data transfer rate to the tape drive.
- High concurrency and a good data transfer rate keep the tape drive from repeatedly stopping and restarting, a behavior known as “shoe-shining”.
Multiplexing:
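As a rough illustration of the idea, the following minimal Python sketch interleaves blocks from several clients into one sequential stream for a single tape drive. The function name `multiplex`, the block labels, and the strict round-robin order are assumptions made for this example only; real backup software decides the interleaving based on which client has data ready.

```python
from itertools import zip_longest


def multiplex(client_streams):
    """Interleave blocks from several client streams into one tape stream.

    Hypothetical helper: a strict round-robin is assumed here purely
    for illustration.
    """
    tape = []
    for round_of_blocks in zip_longest(*client_streams):
        # Clients that have already finished contribute None; skip those.
        tape.extend(block for block in round_of_blocks if block is not None)
    return tape


# Three clients (concurrency = 3) writing to the same physical tape.
clients = [
    ["A1", "A2", "A3"],
    ["B1", "B2"],
    ["C1", "C2", "C3"],
]

tape_stream = multiplex(clients)
print(tape_stream)
# ['A1', 'B1', 'C1', 'A2', 'B2', 'C2', 'A3', 'C3']
```

The drive always has a block to write as long as at least one client is still sending, which is what keeps a physical tape streaming.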
Optimize for virtual tape:
- For good deduplication you must not use multiplexing, so the concurrency must be set to “1” for the tape. With this setting the data stream is optimized so that redundancy in the data stream can be recognized and removed (= deduplication).
- To increase performance, use multistreaming and as many virtual tape drives as needed. As the tape is virtual, you do not need to worry about resources.
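Why multiplexing hurts deduplication can be shown with a toy model. The sketch below assumes fixed-size chunking with SHA-256 fingerprints and two clients whose data does not change between two backup nights. The chunk size, block labels, and interleaving patterns are all made up for illustration, and real deduplication appliances use more sophisticated chunking than this.

```python
import hashlib

CHUNK = 8  # bytes per deduplication chunk (toy value, an assumption)


def new_chunks(store, stream):
    """Add the stream's chunk fingerprints to store; return how many were new."""
    added = 0
    for i in range(0, len(stream), CHUNK):
        fingerprint = hashlib.sha256(stream[i:i + CHUNK]).hexdigest()
        if fingerprint not in store:
            store.add(fingerprint)
            added += 1
    return added


def interleave(blocks_a, blocks_b, pattern):
    """Build a multiplexed stream; pattern names the client supplying each block."""
    a, b = iter(blocks_a), iter(blocks_b)
    return b"".join(next(a) if who == "A" else next(b) for who in pattern)


# Two clients with four 4-byte blocks each; the data is identical on both nights.
blocks_a = [f"A{i:03d}".encode() for i in range(4)]
blocks_b = [f"B{i:03d}".encode() for i in range(4)]

# Concurrency 1: each client writes its own, unmixed stream.
store = set()
for night in range(2):
    separate_night2 = (new_chunks(store, b"".join(blocks_a))
                       + new_chunks(store, b"".join(blocks_b)))

# Multiplexed: the interleaving differs from night to night.
store = set()
new_chunks(store, interleave(blocks_a, blocks_b, "ABABABAB"))
multiplexed_night2 = new_chunks(store, interleave(blocks_a, blocks_b, "AABBAABB"))

print(separate_night2, multiplexed_night2)
# 0 4
```

In this model the second night stores no new chunks when each client has its own stream, but every chunk is new when the same data arrives in a different interleaving.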
Multistreaming:
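In contrast to multiplexing, multistreaming gives every backup stream its own virtual drive and runs the drives in parallel. The following Python sketch models this with a thread pool. `write_to_virtual_drive` and the `vdrive-N` names are hypothetical stand-ins for illustration, not part of any real product API.

```python
from concurrent.futures import ThreadPoolExecutor


def write_to_virtual_drive(drive_id, blocks):
    """Hypothetical stand-in for one stream writing to its own virtual drive."""
    # Concurrency 1: the drive sees one client's blocks, in original order.
    return drive_id, list(blocks)


clients = {
    "A": ["A1", "A2", "A3"],
    "B": ["B1", "B2"],
    "C": ["C1", "C2", "C3"],
}

# One virtual drive per client stream, all running at the same time.
with ThreadPoolExecutor(max_workers=len(clients)) as pool:
    futures = [
        pool.submit(write_to_virtual_drive, f"vdrive-{n}", blocks)
        for n, blocks in enumerate(clients.values())
    ]
    drives = dict(future.result() for future in futures)

for drive_id, blocks in drives.items():
    print(drive_id, blocks)
```

Throughput comes from the number of parallel drives rather than from mixing clients on one drive, so each stream stays unmixed for deduplication.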
Good article. After a long battle with slow backups and slow object copies I figured this out myself. The article would have been even better if it had addressed the best way to connect drives, both physical and virtual, e.g. in an FC fabric. In my experience, the way you physically connect them can make a huge speed difference. Keywords: speed vs. redundancy, LUN mapping. Maybe in a follow-up article?