Backup and Archive

From PGVWiki

The data you have collected are likely the result of hundreds (or thousands) of hours of work, your own or that of other contributors. Unlike transient documents, this work is expected to serve future generations of researchers and interested viewers. One could look at the collected data as an archive that might be visited 20, 50 or 100 years from now.

It is imperative that the data be preserved against all kinds of disasters. A disaster can be as simple as a disk crash, which, given enough time, will occur with certainty; the web provider can go out of business; the database software will become obsolete (also a certainty, and on a much shorter time scale than the disk crash). Possibly the biggest danger is fading data formats. Who today can read documents written in PC-Write, a text editor that was very popular in 1985? It is impossible to predict the future of standards like Gedcom, so one needs to stay vigilant and keep looking ahead. Gedcom has been stable for many years, is wildly popular, and is reasonably verbose, all of which improve the chances that (unlike the PC-Write format) it will still be decipherable in 20 years. One should also consider handing the work over to a successor: what data will he or she need if tomorrow I am unable or unwilling to continue?

To begin with, the general backup rules apply here: back up everything, automatically, with copies off-site. The Wikipedia articles on Backup and Archive are a good starting point. In many cases your website provider makes backups (more or less diligently), and you may have no control over them. Such procedures, if properly applied, will allow you to restore the current, live system, but not to prepare a migration to an unknown future.

Backing up and archiving

What data do you need to back up and/or archive?

  1. The most valuable (or the most irreplaceable) are the results of genealogical research. They are candidates for both backup and archiving. They are stored in three (or possibly more) places:
    • Gedcom - textual genealogical data are stored in a Gedcom file, or temporarily in a database from which they can be exported to a Gedcom file.
    • Media - digital images of photos and documents, and all other records in one of many multimedia formats, are stored as separate files on disk, in a hierarchy of directories.
    • Originals - the paper (or other 'old' medium) originals of the material (records, photos, documents, etc.), stored in one or more physical repositories.
  2. Difficult (but not impossible) to replace are the data on the users of the system: their preferences and details, configuration and options, etc. These need to be backed up, and archiving is also worthwhile, as a future researcher may be interested in the users as well.
  3. Easiest to restore (though this may still take some work) are the data used by various external plug-ins and modules. The GoogleMap module is an example: you might have added a lot of geo-data using your preferred method of identifying places. Back it up; archiving is probably not very important.
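
As an illustration of item 1, the following is a minimal sketch of how an exported Gedcom file and the media directory tree might be bundled into one dated archive, together with a checksum list that lets a copy be verified later. It is not one of PhpGedView's own tools; the function names, the archive naming scheme and the layout inside the archive are assumptions made for this example only.

```python
import hashlib
import tarfile
import time
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_archive(gedcom_file: Path, media_dir: Path, dest_dir: Path):
    """Bundle a Gedcom file and a media tree into a dated .tar.gz.

    Hypothetical helper: returns the paths of the archive and of a
    plain-text manifest holding one "<sha256>  <name>" line per file.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive_path = dest_dir / f"genealogy-{stamp}.tar.gz"
    manifest_path = dest_dir / f"genealogy-{stamp}.sha256"

    # Collect every regular file below the media directory.
    media_files = sorted(p for p in media_dir.rglob("*") if p.is_file())

    manifest_lines = []
    with tarfile.open(archive_path, "w:gz") as archive:
        # The Gedcom file goes under gedcom/, media files under media/,
        # keeping the original directory hierarchy of the media tree.
        entries = [(gedcom_file, f"gedcom/{gedcom_file.name}")]
        entries += [
            (p, "media/" + p.relative_to(media_dir).as_posix())
            for p in media_files
        ]
        for path, name in entries:
            archive.add(path, arcname=name)
            manifest_lines.append(f"{sha256_of(path)}  {name}")

    manifest_path.write_text("\n".join(manifest_lines) + "\n")
    return archive_path, manifest_path
```

A call such as `make_archive(Path("export/tree.ged"), Path("media"), Path("backups"))` (all three paths are placeholders) could be run from a scheduled job, and the resulting archive and manifest then copied off-site. The user data and module data from items 2 and 3 live in the database and are not covered by this sketch.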

PhpGedView has several tools for exporting and backing up the data.


(Work in progress)