PC Backups
Creating and using backups
Backups have two distinct purposes. The primary purpose is to recover data after its loss, deletion or corruption. The secondary purpose of backups is to recover data from an earlier time, according to a user-defined data retention policy.
Storage is the base of a backup system, and it starts with the concept of a data repository. The backup data needs to be stored somewhere and should be organized to some degree. The organization can be as simple as a sheet of paper listing all backup tapes and the dates they were written, or as sophisticated as a computerized index, catalog, or relational database. Different repository models have different advantages.
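As an illustration of the "relational database" end of that range, here is a minimal sketch of a backup catalog using Python's built-in sqlite3 module. The table layout, column names, and function names are assumptions made for this example and do not describe any particular backup product.

```python
import sqlite3


def create_catalog(path=":memory:"):
    """Open (or create) a small catalog database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS backups (
               id          INTEGER PRIMARY KEY,
               media_label TEXT NOT NULL,  -- e.g. the label written on a tape
               written_on  TEXT NOT NULL,  -- ISO date, so text order is date order
               kind        TEXT NOT NULL   -- 'full', 'incremental', ...
           )"""
    )
    return conn


def record_backup(conn, media_label, written_on, kind):
    """Add one catalog entry for a piece of backup media."""
    conn.execute(
        "INSERT INTO backups (media_label, written_on, kind) VALUES (?, ?, ?)",
        (media_label, written_on, kind),
    )
    conn.commit()


def media_written_on_or_before(conn, date):
    """Return media labels written on or before the given ISO date, oldest first."""
    rows = conn.execute(
        "SELECT media_label FROM backups WHERE written_on <= ? ORDER BY written_on",
        (date,),
    )
    return [row[0] for row in rows]
```

The same information could live on the sheet of paper mentioned above; the database only makes questions such as "which media do I need for a given date?" quicker to answer.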
Backup types
Unstructured: An unstructured repository may simply be a stack of floppy disks or CD-R/DVD-R media with minimal information about what was backed up and when. This is the easiest to implement, but probably the least likely to achieve a high level of recoverability.
Full only / System imaging: A repository of this type contains complete system images from one or more specific points in time. This technology is frequently used by computer technicians to record known good configurations. Imaging is generally more useful for deploying a standard configuration to many systems rather than as a tool for making ongoing backups of diverse systems.
Incremental: An incremental style repository aims to make it more feasible to store backups from more points in time by organizing the data into increments of change between points in time. This eliminates the need to store duplicate copies of unchanged data, which successive full backups would otherwise contain. Typically, a full backup (of all files) is made first and serves as the reference point for an incremental backup set. After that, any number of incremental backups are made. Restoring the whole system to a certain point in time requires locating the last full backup taken before that point, plus every incremental backup that covers the period between the full backup and the point in time to which the system is to be restored. Additionally, some backup systems can reorganize the repository to synthesize full backups from a series of incrementals.
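The restore rule for an incremental repository can be sketched as follows. This is a minimal illustration, assuming each backup is described only by a sortable timestamp and a kind; the Backup class and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    taken_at: int  # any sortable timestamp (hypothetical representation)
    kind: str      # "full" or "incremental"


def incremental_restore_chain(backups, restore_point):
    """Return the backups needed to restore to restore_point, in apply order.

    The chain is the last full backup at or before the restore point,
    followed by every incremental taken after it up to the restore point.
    """
    eligible = sorted(
        (b for b in backups if b.taken_at <= restore_point),
        key=lambda b: b.taken_at,
    )
    fulls = [b for b in eligible if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup at or before the restore point")
    base = fulls[-1]
    increments = [
        b for b in eligible
        if b.kind == "incremental" and b.taken_at > base.taken_at
    ]
    return [base] + increments
```

Every incremental in the chain is required, so the loss of a single one breaks the restore for all later points in time.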
Differential: A differential style repository saves the data that has changed since the last full backup. It has the advantage that at most two data sets are needed to restore the data. One disadvantage, at least compared to the incremental method, is that as the time since the last full backup (and thus the amount of changed data) increases, so does the time needed to perform the differential backup. To perform a differential backup, it is first necessary to perform a full backup. After that, each differential backup contains all the changes since that full backup. Restoring an entire system to a certain point in time requires locating the last full backup taken before the point of failure or loss, plus the most recent differential backup made after that full backup.
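The "at most two data sets" property can be shown with a short sketch. It assumes each backup is a (timestamp, kind) tuple, which is a representation chosen only for this example; the function name is hypothetical.

```python
def differential_restore_set(backups, restore_point):
    """Return the one or two backups needed to restore to restore_point.

    backups is an iterable of (taken_at, kind) tuples, where kind is
    "full" or "differential". The result is the last full backup at or
    before the restore point, plus the newest differential taken after
    that full backup (if there is one).
    """
    eligible = sorted(b for b in backups if b[0] <= restore_point)
    fulls = [b for b in eligible if b[1] == "full"]
    if not fulls:
        raise ValueError("no full backup at or before the restore point")
    base = fulls[-1]
    diffs = [b for b in eligible if b[1] == "differential" and b[0] > base[0]]
    # Only the newest differential matters: it already contains every
    # change made since the full backup.
    return [base] + diffs[-1:]
```

Earlier differentials can be skipped or even lost without affecting the restore, which is the trade-off for each differential growing larger over time.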
Reverse delta: A reverse delta type repository stores a recent "mirror" of the source data and a series of differences between the mirror in its current state and its previous states. A reverse delta backup starts with a normal full backup. After the full backup is performed, the system periodically synchronizes the full backup with the live copy, while storing the data necessary to reconstruct older versions. This system works particularly well for large, slowly changing data sets. It differs from simple disk mirroring in that the stored differences can be rolled back, which allows an older image of the data to be restored.
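A toy model may make the reverse delta idea more concrete. The sketch below assumes the data set is a dictionary mapping file names to contents; the class name, the sentinel, and the whole-file granularity are simplifications made for illustration, and real systems typically store finer-grained differences.

```python
_ABSENT = object()  # marks "this file did not exist in the older version"


class ReverseDeltaRepo:
    def __init__(self, initial):
        self.mirror = dict(initial)  # the initial full backup
        self.reverse_deltas = []     # oldest first

    def sync(self, live):
        """Bring the mirror up to date, keeping what is needed to go back."""
        delta = {}
        for name in self.mirror.keys() | live.keys():
            old = self.mirror.get(name, _ABSENT)
            new = live.get(name, _ABSENT)
            if old != new:
                delta[name] = old  # remember the OLD value only
        self.reverse_deltas.append(delta)
        self.mirror = dict(live)

    def restore(self, steps_back=0):
        """Rebuild the state as it was steps_back synchronizations ago."""
        if steps_back > len(self.reverse_deltas):
            raise ValueError("not enough history")
        state = dict(self.mirror)
        start = len(self.reverse_deltas) - steps_back
        # Apply the newest delta first, then progressively older ones.
        for delta in reversed(self.reverse_deltas[start:]):
            for name, old in delta.items():
                if old is _ABSENT:
                    state.pop(name, None)
                else:
                    state[name] = old
        return state
```

Restoring the latest state needs no delta processing at all, while older states cost more the further back they lie. That is the opposite of an incremental chain, where the newest state is the most expensive to rebuild.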
Remote backup service: As broadband Internet access becomes more widespread, remote backup services are gaining in popularity. Backing up via the Internet to a remote location can protect against some worst-case scenarios such as fires, floods, or earthquakes, which would destroy any backups in the immediate vicinity along with everything else. There are, however, a number of drawbacks to remote backup services. First, Internet connections are usually slower than local data storage devices. Residential broadband is especially problematic because routine backups must use an upstream link that is usually much slower than the downstream link, which is used only occasionally to retrieve a file from backup. This tends to limit the use of such services to relatively small amounts of high-value data. Second, users must trust a third-party service provider to maintain the privacy and integrity of their data, although confidentiality can be assured by encrypting the data before transmission with an encryption key known only to the user. Ultimately, the backup service must itself use one of the above methods, so this could be seen as a more complex way of doing traditional backups.
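The effect of a slow upstream link is easy to estimate with simple arithmetic. The helper below is a rough sketch; the function name is made up for this example, the link speeds used with it are illustrative rather than measurements, and protocol overhead and compression are ignored.

```python
def transfer_hours(size_gb, link_mbit_per_s):
    """Estimate hours needed to move size_gb gigabytes over a link.

    Uses decimal units (1 GB = 10**9 bytes, 1 Mbit = 10**6 bits) and
    assumes the link runs at its full rated speed the whole time.
    """
    bits = size_gb * 8 * 10**9
    seconds = bits / (link_mbit_per_s * 10**6)
    return seconds / 3600
```

For example, under these assumptions `transfer_hours(100, 10)` comes to a little over 22 hours for a 100 GB upload on a 10 Mbit/s upstream link, while the same data over a 100 Mbit/s link would take a bit over 2 hours.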
On-line: On-line backup storage is typically the most accessible type of data storage; a restore can begin within milliseconds. A good example would be an internal hard disk or a disk array (possibly connected to a SAN). This type of storage is very convenient and fast, but is relatively expensive. On-line storage is quite vulnerable to being deleted or overwritten, whether by accident, by intentional malevolent action, or by a data-deleting virus payload.
Near-line: Near-line storage is typically less accessible and less expensive than on-line storage, but still useful for backup data storage. A good example would be a tape library with restore times ranging from seconds to a few minutes. A mechanical device is usually involved in moving media units from storage into a drive where the data can be read or written. Generally it has safety properties similar to on-line storage.
Off-line: Off-line storage requires some direct human action in order to make access to the storage media physically possible. This action is typically inserting a tape into a tape drive or plugging in a cable that allows a device to be accessed. Because the data is not accessible via any computer except during limited periods in which it is written or read back, it is largely immune to a whole class of on-line backup failure modes. Access time will vary depending on whether the media is on-site or off-site.
Storage media types
Magnetic tape: Magnetic tape has long been the most commonly used medium for bulk data storage, backup, archiving, and interchange. Tape has typically had an order of magnitude better capacity/price ratio when compared to hard disk, but recently the ratios for tape and hard disk have become a lot closer. There are myriad formats, many of which are proprietary or specific to certain markets like mainframes or a particular brand of personal computer. Tape is a sequential access medium, so even though access times may be poor, the rate of continuously writing or reading data can actually be very fast. Some new tape drives are even faster than modern hard disks. A principal advantage of tape is that it has been used for this purpose for decades (much longer than any alternative) and its characteristics are well understood.
Hard disk: The capacity/price ratio of hard disk has been rapidly improving for many years. This is making it more competitive with magnetic tape as a bulk storage medium. The main advantages of hard disk storage are low access times, availability, capacity and ease of use. External disks can be connected via local interfaces like SCSI, USB, FireWire, or eSATA, or via longer distance technologies like Ethernet, iSCSI, or Fibre Channel. Some disk-based backup systems, such as Virtual Tape Libraries, support data deduplication which can dramatically reduce the amount of disk storage capacity consumed by daily and weekly backup data. The main disadvantages of hard disk backups are that they are easily damaged, especially while being transported (e.g., for off-site backups), and that their stability over periods of years is a relative unknown.
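To show why deduplication can save so much space across repeated backups, here is a minimal sketch of a content-addressed chunk store. It assumes fixed-size chunks identified by their SHA-256 digest; the class and method names are hypothetical, and real products may use different chunking and indexing schemes.

```python
import hashlib


class DedupStore:
    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes; each unique chunk kept once

    def put(self, data):
        """Store data and return its 'recipe': the ordered list of chunk digests."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Reassemble the original data from a recipe."""
        return b"".join(self.chunks[digest] for digest in recipe)

    def stored_bytes(self):
        """Total bytes actually held, after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())
```

If two daily backups differ in only one chunk, the second backup adds just that chunk to the store; everything else is recorded as references to chunks that are already present.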
Optical disc: Recordable CDs, DVDs, and Blu-ray Discs are commonly used with personal computers and generally have low media unit costs. However, the capacities and speeds of these and other optical discs are typically an order of magnitude lower than those of hard disk or tape. Many optical disc formats are WORM (write once, read many) type, which makes them useful for archival purposes since the data cannot be changed. The use of an auto-changer or jukebox can make optical discs a feasible option for larger-scale backup systems. Some optical storage systems allow for cataloged data backups without human contact with the discs, which helps preserve data integrity for longer.
Floppy disk: During the 1980s and early 1990s, many personal/home computer users associated backing up mostly with copying to floppy disks. However, the data capacity of floppy disks failed to catch up with growing demands, rendering them unpopular and obsolete.
Solid state storage: Also known as flash memory, this category includes thumb drives, USB flash drives, CompactFlash, SmartMedia, Memory Stick, Secure Digital cards, and similar devices, which are relatively expensive for their low capacity. Unlike a magnetic hard drive, a solid state drive (SSD) contains no moving parts, and it can reach throughput on the order of 500 Mbit/s to 6 Gbit/s. SSDs are now available in capacities from around 500 GB up to several terabytes.