Backing up multiple terabytes of data
23 Apr 2002
Recently, the Storage Networking Industry Association (SNIA) demonstrated
today's state of the art in backup speed: one terabyte per hour.
At the Storage Networking Technology Center in Colorado Springs, demo teams
carefully divided the data into 16 equal volumes of 64G bytes and used large
tape libraries with sophisticated data movers and the fastest tape drives
available today. Even so, few teams were able to reach the necessary backup
rate: achieving the goal requires moving data from disk to tape at an
impressive sustained rate of 350M bytes per second.
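As a rough cross-check on those figures, the short Python sketch below
(illustrative only) computes the raw payload rate implied by one terabyte per
hour; the 350M bytes per second quoted above is higher, presumably to allow for
overhead in the data path.

# Rough arithmetic: raw sustained rate implied by 1T byte per hour.
TERABYTE = 1024 ** 4            # bytes (binary terabyte)
SECONDS_PER_HOUR = 3600

rate_bytes_per_sec = TERABYTE / SECONDS_PER_HOUR
print("%.0fM bytes/sec" % (rate_bytes_per_sec / 1024 ** 2))   # about 291M bytes/sec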
But if the technology allows a backup rate of just 1T byte per hour, how is it
possible to back up a site with six to 10T bytes every day? At these data
volumes, even incremental backups take a long time just to search for changes.
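The arithmetic for a full daily backup is equally simple (a sketch, using the
site sizes mentioned above):

# Hours of backup window needed for a full backup at 1T byte per hour.
BACKUP_RATE_TB_PER_HOUR = 1.0

for site_tb in (6, 10):
    hours = site_tb / BACKUP_RATE_TB_PER_HOUR
    print("%dT bytes -> %.0f hour window" % (site_tb, hours))
# 6T bytes -> 6 hours; 10T bytes -> 10 hours, a large part of every day.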
Today it is very common to find sites with multiple terabytes, even at midsize
companies. At the same time, the value of the data and the cost of data loss
keep rising, which argues for performing backups even more frequently.
It is evident that the gap between backup speed and the amount of data is
widening rapidly, while the frequency with which this data needs to be saved is
also increasing. What will happen in two years?
For companies that face this problem today, it will become critical. For many
others, it will appear for the first time. The solution needs to come from
technologies other than those used for backup today.
Backup is mostly used to prevent data loss in the following scenarios:
1. Storage device failures.
2. Site failures and disasters.
3. Software or human errors.
In the first two scenarios, hardware redundancy (e.g. disks, controllers,
HBAs, fabrics) provides the most cost-effective solution. In environments with
multiple terabytes of data, the cost of downtime or data loss is much higher
than the cost of the redundant hardware.
But hardware redundancy doesn't prevent data loss when a virus, a hacker, or a
software or human error strikes. For example, if a database is corrupted, the
corruption is replicated to the mirror site as well, even when the sites are
fully mirrored.
The proposed solution in the third scenario is to use multilevel snapshot
technology. This makes it possible to create almost instant virtual copies of
data at multiple points in time, without needing to move data. Multilevel
snapshots, combined with hardware redundancy, provide a very scalable solution
for protecting data in multiple-terabyte environments. Because a snapshot copy
is created within seconds regardless of the amount of data, more frequent
backups become practical, drastically reducing the potential cost of data loss
and downtime.
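To illustrate why a snapshot is created in seconds regardless of data volume,
here is a minimal copy-on-write sketch in Python. The class and method names
are illustrative assumptions, not any vendor's actual implementation; the point
is that taking a snapshot only freezes a small block map, while data blocks are
left in place.

# Minimal copy-on-write snapshot sketch (illustrative names, not a product API).
class Volume:
    def __init__(self):
        self.blocks = {}        # block number -> data
        self.snapshots = []     # frozen block maps, one per point in time

    def write(self, block_no, data):
        self.blocks[block_no] = data

    def read(self, block_no):
        return self.blocks.get(block_no)

    def take_snapshot(self):
        # Only the small block map is duplicated; no data blocks are read,
        # copied, or moved. A real copy-on-write implementation avoids even
        # this map copy by sharing metadata until blocks are overwritten.
        self.snapshots.append(dict(self.blocks))
        return len(self.snapshots) - 1     # snapshot id

    def read_from_snapshot(self, snap_id, block_no):
        return self.snapshots[snap_id].get(block_no)


vol = Volume()
vol.write(0, "payroll v1")
snap = vol.take_snapshot()                  # near-instant, whatever the volume size
vol.write(0, "payroll v2 (corrupted by software error)")
print(vol.read(0))                          # current, corrupted data
print(vol.read_from_snapshot(snap, 0))      # intact point-in-time copy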
Today, storage virtualization companies have developed the capability to create
multiple point-in-time snapshot copies, while backup companies can manage
catalogs of datasets that reside on tape cartridges.
What will solve the problem of the backup window?
Adding technology to backup software packages so they can manage dataset
catalogs that reside on snapshot copies. In this way, the backup software will
be able to uniformly manage datasets kept in snapshot copies or on tape
cartridges and intelligently move them from one to the other, as sketched below.
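As a sketch of what such a unified catalog might look like (the names and
structure are assumptions for illustration, not any vendor's catalog format),
each dataset version simply records where it currently lives, and the backup
software migrates aging versions from snapshot to tape:

# Illustrative unified backup catalog: a dataset version can live on a
# snapshot copy or on a tape cartridge, and can be migrated between them.
class CatalogEntry:
    def __init__(self, dataset, version, location, reference):
        self.dataset = dataset        # e.g. "orders_db"
        self.version = version        # point-in-time label
        self.location = location      # "snapshot" or "tape"
        self.reference = reference    # snapshot id or cartridge barcode

class BackupCatalog:
    def __init__(self):
        self.entries = []

    def register_snapshot(self, dataset, version, snapshot_id):
        self.entries.append(CatalogEntry(dataset, version, "snapshot", snapshot_id))

    def migrate_to_tape(self, dataset, version, cartridge_barcode):
        # After the data has been streamed to tape in the background,
        # the catalog entry is updated to point at the cartridge.
        for entry in self.entries:
            if entry.dataset == dataset and entry.version == version:
                entry.location = "tape"
                entry.reference = cartridge_barcode

    def locate(self, dataset, version):
        for entry in self.entries:
            if entry.dataset == dataset and entry.version == version:
                return entry.location, entry.reference
        return None

catalog = BackupCatalog()
catalog.register_snapshot("orders_db", "2002-04-23T02:00", snapshot_id=7)
catalog.migrate_to_tape("orders_db", "2002-04-23T02:00", cartridge_barcode="TAPE0042")
print(catalog.locate("orders_db", "2002-04-23T02:00"))    # ('tape', 'TAPE0042')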
Once enterprise backup software vendors include storage virtualization
techniques in their products, it will be possible to create a completely
scalable solution for backing up multiple terabytes of data, one that can serve
the fast-growing list of customers running into the backup window problem.
About the author: Nelson Nahum is a co-founder of StoreAge Networking
Technologies and has been its Chief Technology Officer since the company's
inception in April 1999.