Incremental Backups of Databases

an article added by: Ben Smeider at 11272007



In general, incremental backups are limited to filesystems, although some backup vendors do have technology that permits incremental backups of databases. Specifically, to do an incremental backup of a database, the blocks that have changed must be backed up. Once they are backed up, pointers and indices must be maintained so that the blocks can be put back into the database upon restore. Some solutions require a complete scan of the database for changed blocks. At least one solution (VERITAS NetBackup with the VERITAS File System, only available on Unix) keeps track of the blocks that have changed and does not need to do a full database scan in order to perform an incremental backup. As with all incremental backups, a full restore requires the tape from the full backup plus the tapes from all of the incrementals taken since then (or the full plus only the most recent cumulative incremental tape).
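The block-level scheme described above can be illustrated with a small sketch. It is not any vendor's implementation: the function names, the tiny block size, and the use of SHA-256 digests to detect changed blocks are all assumptions made for the example, and it takes the "complete scan" approach rather than tracking changes as they happen. It also assumes the data never shrinks between backups.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny block size so the example is easy to follow


def split_blocks(data, size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last one may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def block_digests(data):
    """Index kept after each backup: one digest per block, in block order."""
    return [hashlib.sha256(block).hexdigest() for block in split_blocks(data)]


def incremental(data, previous_digests):
    """Full scan: return {block index: contents} for new or changed blocks."""
    changed = {}
    for index, block in enumerate(split_blocks(data)):
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(previous_digests) or previous_digests[index] != digest:
            changed[index] = block
    return changed


def restore(full, incrementals):
    """Start from the full backup and apply each incremental, oldest first."""
    blocks = split_blocks(full)
    for inc in incrementals:
        for index, block in sorted(inc.items()):
            if index < len(blocks):
                blocks[index] = block      # put a changed block back in place
            else:
                blocks.append(block)       # block added after the full backup
    return b"".join(blocks)


day0 = b"AAAABBBBCCCC"          # state at the full backup
day1 = b"AAAAXXXXCCCC"          # one block rewritten
day2 = b"AAAAXXXXCCCCDDDD"      # one block appended

inc1 = incremental(day1, block_digests(day0))        # since previous backup
inc2 = incremental(day2, block_digests(day1))        # since previous backup
cumulative = incremental(day2, block_digests(day0))  # since the full backup
```

Restoring with `restore(day0, [inc1, inc2])` and with `restore(day0, [cumulative])` both reproduce `day2`, which mirrors the choice between "the full and all of the incrementals" and "the full and the most recent cumulative incremental."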

Shrinking Backup Windows

A backup window is the amount of time that systems can be affected by a backup. That used to mean the time that the systems were unavailable while backups were taken. Nowadays, systems can remain operational while backups go on, but they will likely suffer a performance impact unless special precautions are taken. The bottom line on backup windows is that they are getting smaller and smaller, approaching zero. In many shops, they already are zero; backups cannot cause any service interruption at all. Fortunately, even tiny backup windows are no excuse to give backups short shrift. In this section, we look at some of the techniques for shortening the duration of the interruptions caused by taking backups. Some of these techniques reduce the amount of data being backed up; others require additional specialized hardware or software to get their job done. Some techniques are specific to one vendor or another, and others work only on databases, or only on filesystems.

Hot Backups

The ultimate in backup windows is one that causes no interruption whatsoever. This can be achieved on databases, and can very nearly be achieved on filesystems. The biggest problem in achieving hot backups is one of data consistency. Since it takes time for a backup to run and complete, files or data can change within the data store (filesystem or disks) being backed up. In order for a hot backup to be successful, there must be a mechanism inserted between the data store and the application that may be writing there.

Most good commercial backup utilities support online, or hot, backups of filesystems and databases. (Homegrown solutions generally do not.) To achieve reliable hot backups, the backup utility must interface, often at a very low level, with a utility specifically designed for the application or filesystem being backed up. For example, the Oracle database makes two different utilities available (depending on the version of Oracle you are running). The backup utility that writes data to tape (or other backup medium) must know how to speak with OEBU (Oracle Enterprise Backup Utility) or RMAN (Recovery Manager) and turn the data that those utilities provide into data that can be written to (and later read from) tape. MS-SQL, DB2, Sybase, and Informix all have similar utilities to allow hot backups. In addition, some commercial backup products have their own agents that work with the published database APIs to enable hot backups without the need for utilities like RMAN or OEBU.

Normally, during a hot database backup, the database is put into a state where database writes are locked out of the actual database and only written to a log. When the backup completes, the contents of the log are run as a batch job and are added to the database. There are some performance implications with this method, because of the batched nature of the additions. There is also usually some performance overhead associated with logging the changes.

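The log-and-replay behavior described above can be pictured with a toy model. This is only a sketch under stated assumptions: `ToyDatabase` and its methods are hypothetical names invented for the example, and no real database product exposes this interface. Real products also handle reads of pending changes, crashes, and concurrency, none of which appear here.

```python
class ToyDatabase:
    """Hypothetical in-memory store that diverts writes to a log during backup."""

    def __init__(self):
        self.data = {}          # the "actual database"
        self.log = []           # writes held back while a backup is running
        self.in_backup = False

    def write(self, key, value):
        if self.in_backup:
            self.log.append((key, value))   # locked out of the data, logged only
        else:
            self.data[key] = value

    def begin_backup(self):
        self.in_backup = True

    def end_backup(self):
        # Replay the log as one batch, in the order the writes arrived.
        for key, value in self.log:
            self.data[key] = value
        self.log.clear()
        self.in_backup = False


db = ToyDatabase()
db.write("a", 1)

db.begin_backup()
snapshot = dict(db.data)    # the backup sees a frozen, consistent view
db.write("b", 2)            # arrives mid-backup, so it goes to the log
pending = list(db.log)
db.end_backup()             # logged writes are applied after the backup
```

The snapshot contains only the data as it stood when the backup began, while the write that arrived during the backup reaches the database once the log is replayed. The batch replay at the end is where the performance implications mentioned above would show up.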
Hot database backups are very vendor-specific and are an almost constantly changing technology. They are generally quite reliable and so should definitely be considered when you are evaluating ways to shrink backup window requirements.

As for filesystem backups, there is a consistent problem across both Unix and Windows filesystems. Filesystem backups are usually taken in two passes: a first pass that determines and backs up the file and directory layout, and a second pass that backs up the files and their contents. If a file changes between the two passes, inconsistent results can occur, but the most likely result is that the file will not be backed up properly.

Another problem will occur if an application requires that two or more files are consistent. If they change during the backup, different versions may be backed up as the backup utility walks through the filesystem, resulting in inconsistent versions on the backup tape.

Filesystem backups vary by operating system. On Windows, there is an issue with backing up open files. If an application has a file open, the file is locked. A locked file cannot be accessed by any other applications. Obviously, in order to back up a file, a backup utility must access the file. If the utility cannot access (or lock) a particular file, it will time out in its attempt to back up the file, and move on. When this occurs, the locked file will not get backed up at all.

There are at least two utilities for assisting in the backing up of open Windows files: St. Bernard’s Open File Manager and Columbia Data Systems’ Open Transaction Monitor. Some Windows backup manufacturers integrate one of these products right into their software, while others deliver their own. These utilities operate in kernel mode, filtering I/O for calls to open or locked files. They keep a copy of the open file in cache memory and present that copy to the backup software. In this way, this software guarantees that a consistent copy of the file gets backed up. The open file management software may also be able to cache the entire filesystem in memory (or on another disk) and present a consistent and unchanging copy of the filesystem to the backup utility.

On Unix, open files are not as much of an issue. A file that is written to disk can be backed up by reading it off the disk. If a file is open and being written to, then the disk copy of the file may not reflect its most recent contents, but your backup will get the most recent copy that is on the disk.

Several products that do hot filesystem backups make multiple passes over the filesystem’s inodes to be sure that file sizes and modification times have not changed during the course of the backup; if they have, the files that were being written to are dumped to tape again.

Of course, running backups while production is going on can have a discernible impact on overall system and network performance, so even if backups can be done online, we still want to keep them as brief as possible, which leads back to the earlier discussion of incremental versus full backups. There are other ways to shorten the duration of your backups, which are discussed next.
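The multi-pass check that hot filesystem backup products perform can be approximated for a single file as follows. This is a minimal sketch, not how any particular product works: `file_signature`, `backup_file`, and the retry limit are hypothetical, and copying to another file stands in for dumping to tape. It compares the file's size and modification time before and after the copy and copies again if either changed.

```python
import os
import shutil
import tempfile


def file_signature(path):
    """Size and modification time, the two values checked between passes."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def backup_file(src, dst, max_attempts=3):
    """Copy src to dst; re-copy if src changed while it was being copied.

    Returns True if a copy completed with no change detected, else False.
    """
    for _ in range(max_attempts):
        before = file_signature(src)
        shutil.copyfile(src, dst)
        after = file_signature(src)
        if before == after:
            return True     # nothing changed during the copy
    return False            # file kept changing; caller should flag it


# Usage example on a throwaway file that nobody else is writing to.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "data.txt")
    dst = os.path.join(tmp, "data.bak")
    with open(src, "w") as f:
        f.write("hello")
    ok = backup_file(src, dst)
    with open(dst) as f:
        copied = f.read()
```

A check like this only detects changes that alter the size or the modification time, and it says nothing about consistency across several related files, which is the separate problem raised earlier.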

Have Less Data, Save More Time (and Space)

There are at least two ways to reduce the amount of data that gets backed up in a filesystem. Both involve removing older and less used data from the filesystem, and storing it someplace else.

