Link Rot in your Family Tree? -
A Case Study in the Dynamics of Internet Genealogy


This article first appeared in the July/August 1998 issue of the National Genealogical Society's Computer Interest Group Digest


By Mark Howells


Change is the Only Constant

The best thing about genealogical information on the web is that it can be changed so easily. The worst thing about genealogical information on the web is also that it can be changed so easily. This article will explore the transient nature of genealogy web sites. Using a small sample of genealogy web sites as a case study, changes which have occurred to the sample web sites over time will be examined. These changes will be categorized by the type of change. Finally, this article will discuss some positive steps which can be utilized by a genealogy web site author to minimize the impact of web site changes on their visitors.

The simple ability to update a family's history as ongoing research reveals new information is an extremely powerful tool. It gives every family historian the ability to publish the fruits of their research and, more importantly, to keep that information current. The use of the web as a publishing tool has freed us from the static confines of the printed word.

This power to revise and change a web site has attracted many family historians to the Internet as web site publishers - to the great benefit of us all. Never before have we had so much of others' research available at our fingertips. In the avalanche of personal web sites which provide genealogical information on the Internet, some interesting questions arise: Is the fluidity of web publishing a bad thing? Does the impermanence of web pages create a problem for genealogical research purposes?

The ability to modify a genealogy web site's contents is not the only issue involved in web site changes. Let's say that information displayed on your web site is of interest to a visitor regarding a suspected common ancestor. The visitor notes the URL of your web site in their research notebook as a possible lead and goes on their way. Some days or months later, the visitor wishes to revisit your web site to confirm the suspected lead. They type in your URL and receive the message "URL Not Found" in reply. Where did your web site go? Did you move it? Did you change your site's file structure so that the old URL is now invalid? Did you stop paying your Internet Service Provider (ISP) bill, so that they removed your web site? When a web site changes URLs, the broken links to that web site's old URL on other pages are referred to as "link rot". Obviously, link rot or web site relocation can be as hazardous to Internet-based genealogy research as changes in web site content.

Pretty Pictures

In October of 1996, this author wrote an online article entitled Pretty Pictures. Its purpose was to highlight genealogy web sites which successfully used graphics in different ways to enhance a visitor's experience of each site. Although ancient in "web years", the article (with corrections) can be found online at Pretty Pictures.

There were a total of twelve web sites mentioned in the original article. Each had been selected for its successful use of graphics only - they were in no other way related to one another. All were authored by different individuals with different objectives for their sites. Several were hosted outside the U.S. Among the examples chosen were a USGenWeb state site, a USGenWeb county site from a different state, a genealogy society's web site, three country or area-specific general genealogy sites, and a state-wide online genealogy project site. The remainder of the sites were the personal genealogy pages of individuals. These sites represent a fairly diverse sample with regard to their purposes and objectives. Some provided links to other sites, some provided research information for particular areas, and some provided lineage information on specific families.

In February 1998, a New Zealander's e-mail directed the author's attention to the fact that many of the links to the twelve sites were broken. In other words, the URLs given in the original article no longer worked for one reason or another. Either the old links returned an error message stating that the URL could not be found or a web page was displayed which did not fit the article's description of that site.

All twelve links had been in working order when first published 16 months before. Obviously, the dynamic nature of the web had taken its toll on the currency of the links in the original article. The article clearly had a case of link rot. After thanking the observant New Zealander, the author began to investigate which links were broken and why they no longer worked. This case study developed out of that investigation.

What Changed?

In the table below is a summary of the sites mentioned in the original article. For each site mentioned, the original URL is provided along with a summary of what problem, if any, resulted when visiting the original link. Finally, the correct URL is provided where known.


Title: South Carolina GenWeb
Original URL: http://www.geocities.com/BourbonStreet/1783/
Problem: URL changed due to change in state co-ordinator. ISP re-assigned original URL to another individual.
New URL: http://www.geocities.com/Heartland/Hills/3837/

Title: Johnson Family Tree Home Page
Original URL: http://ourworld.compuserve.com/homepages/johnson_gen/
Problem: URL Not Found.
New URL: -

Title: Dutch Genealogy Links
Original URL: http://wwwedu.cs.utwente.nl/~hoitink/links.html
Problem: URL Not Found. Academic ISP changed domain name for hosting user web sites.
New URL: http://www.twente.nl/home/genealogy/links.html

Title: Viki's Little Corner of the Web
Original URL: http://www.novia.net/~vikia/
Problem: None. No change to URL or graphics.
New URL: Unchanged.

Title: History & Family Tree of John Jewell
Original URL: http://www.dcscomp.com.au/bruce/family/
Problem: URL changed. Left forwarding page at old URL.
New URL: http://www.dcscomp.com.au/jewell/family-history/

Title: Wayne County, Illinois GenWeb
Original URL: http://bl-12.rootsweb.com/~ilwayne/
Problem: Graphics changed.
New URL: Unchanged.

Title: Swedish Genealogy Page
Original URL: http://www.ts.umu.se/~petersj/swegen.html
Problem: None. No change to URL or graphics.
New URL: Unchanged.

Title: Kentucky Biographies Project
Original URL: http://www.starbase21.com/kybiog/
Problem: None. No change to URL or graphics.
New URL: Unchanged.

Title: Czechoslovak Genealogical Society International
Original URL: http://members.aol.com/cgsintl/index.html
Problem: URL Not Found.
New URL: - (see Author's Note below)

Title: Genealogy Corner
Original URL: http://www.geocities.com/Heartland/1657/genealogy.html
Problem: URL Not Found.
New URL: -

Title: Bill & Peggy's Genealogy (formerly Bill Holder's Home Page)
Original URL: http://www.rapidramp.com/bhold/index.html
Problem: DNS Entry Not Found. ISP is no longer in business or no longer uses the same domain name.
New URL: http://home1.gte.net/wh7262/index.htm

Title: Palatine & Pennsylvania Dutch Genealogy Page
Original URL: http://www.geocities.com/Heartland/3955/index.html
Problem: Graphics changed.
New URL: Unchanged.

To summarize the condition of the original 12 links after sixteen months, 7 links were broken, 3 links had not changed either URLs or graphics, and 2 links had not changed URLs but had changed graphics. Of the 7 links which were no longer functional, replacement URLs for 3 sites could not be found. These 3 links then were truly "dead". For the other 4 broken links, new URLs for the same site could be found by one means or another.

Sites in sample: 12
Sites that had moved URLs (new URLs found): 4
Sites which "disappeared" (no new URLs found): 3
Sites with no changes: 3
Sites whose graphics had changed: 2
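For readers who would like to check the arithmetic, the summary counts can be reproduced from the table of twelve sites with a few lines of Python. This is only a sketch: the category labels ("moved", "lost" and so on) are shorthand invented here, not terms from the original table.

```python
from collections import Counter

# Outcome for each of the twelve sites, read off the table above.
# The labels are shorthand invented for this sketch.
OUTCOMES = {
    "South Carolina GenWeb": "moved",
    "Johnson Family Tree Home Page": "lost",
    "Dutch Genealogy Links": "moved",
    "Viki's Little Corner of the Web": "unchanged",
    "History & Family Tree of John Jewell": "moved",
    "Wayne County, Illinois GenWeb": "graphics changed",
    "Swedish Genealogy Page": "unchanged",
    "Kentucky Biographies Project": "unchanged",
    "Czechoslovak Genealogical Society International": "lost",
    "Genealogy Corner": "lost",
    "Bill & Peggy's Genealogy": "moved",
    "Palatine & Pennsylvania Dutch Genealogy Page": "graphics changed",
}


def tally(outcomes):
    """Count how many sites fall into each outcome category."""
    return Counter(outcomes.values())


counts = tally(OUTCOMES)
# A link is "broken" if the site moved or could not be found at all.
broken = counts["moved"] + counts["lost"]

for label, n in counts.most_common():
    print(f"{label}: {n}")
print(f"broken links: {broken} of {len(OUTCOMES)}")
```

Keeping one entry per site, rather than typing in the totals, means the counts cannot drift out of step with the table if a "lost" site is later found again.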

It is interesting to note that for the genealogy web sites in the sample, the most prevalent form of change was a change in URLs. Changes to web site content in the form of changes to graphics were the least common form of change. Since the web sites in this sample were originally selected for their graphics, a change in graphics makes a good proxy for measuring changes in other content.

The Mechanics Behind the Changes

A review of some of the specific web site URL changes is illuminating. The South Carolina GenWeb site's old URL is not technically broken. Following the old URL now brings up Tmot's Freedom of Speech web site. The new web site at the old URL has nothing to do with genealogy. What has happened here is very interesting. GeoCities attempted to encourage its users to better organize their web pages into appropriate communities. Genealogy pages were asked to migrate to the "Heartland" community so the South Carolina GenWeb co-ordinator gave up their former "Bourbon Street" address for a new "Heartland" address. GeoCities subsequently re-assigned the old "Bourbon Street" address to another user. Thus, when entering the original URL, the visitor is presented with a decidedly non-genealogical web site instead of South Carolina GenWeb.

The change in URL for Dutch Genealogy Links was for a different reason entirely. The site is hosted by the University of Twente in the Netherlands. The University changed the domain name under which their users' accounts are maintained. User accounts are now hosted under www.twente.nl rather than under wwwedu.cs.utwente.nl. So while the academic ISP which hosts the site remained the same, an internal re-organization of the University's domain names caused the original URL to go dead.

Re-organization is also the cause of the change in URLs for the History & Family Tree of John Jewell site. In this case, however, the change is not a result of a domain name change. The ISP hosting this site in Australia did not change domain names nor did the author move his site to another ISP. In fact, the directory structure on the ISP's server which hosts the History & Family Tree of John Jewell changed. The directory structure changed from /bruce/family/ to /jewell/family-history/, thus changing the part of the site's URL to the right of the (unchanged) domain name. Since the author of this site is Bruce Jewell, it appears that the ISP may have changed their account holders' directory structure from a first name to a last name basis. This is most likely the result of growth in the ISP's business.
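The difference between a change of domain name and a change of directory structure can be made concrete by splitting the old and new URLs into their parts and comparing them. The sketch below uses Python's standard urllib.parse module; the function name what_changed is made up for this illustration.

```python
from urllib.parse import urlsplit


def what_changed(old_url, new_url):
    """Return which parts of the URL differ: "host", "path", or both."""
    old, new = urlsplit(old_url), urlsplit(new_url)
    changed = []
    if old.hostname != new.hostname:
        changed.append("host")  # the domain name part of the URL
    if old.path != new.path:
        changed.append("path")  # everything to the right of the domain name
    return changed


# Two of the moves described above, using the URLs from the table.
jewell = what_changed(
    "http://www.dcscomp.com.au/bruce/family/",
    "http://www.dcscomp.com.au/jewell/family-history/",
)
dutch = what_changed(
    "http://wwwedu.cs.utwente.nl/~hoitink/links.html",
    "http://www.twente.nl/home/genealogy/links.html",
)
print("John Jewell site:", jewell)
print("Dutch Genealogy Links:", dutch)
```

For the John Jewell site only the path differs, while for Dutch Genealogy Links both the host and the path differ.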

The History & Family Tree of John Jewell site is noteworthy for how the author of the site handled the change in his URL. When visiting the old URL, the following message appears:

Please note History and Family Tree of John Jewell has moved to the following address
http://www.dcscomp.com.au/jewell/family-history/
Please update your links.
Last Updated 13th November 1996

This "change of address" message resides at the old URL. No visitor should be confused by this clearly written message, which includes a clickable link to the new URL where History & Family Tree of John Jewell now resides. How thoughtful of Bruce Jewell to put this message in place of his old URL! This helpful "we've moved" message should be standard practice for any web site author who changes their directory structure or moves ISPs.

Bill & Peggy's Genealogy was formerly titled Bill Holder's Home Page, but the title change does not relate to why its URL changed. This fun web site's URL changed because its author has changed ISPs - not once but twice. The error message returned by entering the old URL is "DNS (Domain Name System) Not Found". This means that the Internet no longer recognizes the domain name given in the URL. Either the old ISP has gone out of business or they chose to no longer maintain their old domain name. Bill & Peggy's Genealogy was "found" again on the Internet using Cyndi's List of Genealogy Sites on the Internet. Bill Holder had updated his URL information on Cyndi's List so that visitors could still find his web site. However, since changing URLs the first time, Bill has changed ISPs yet again. So at his first new URL - http://cust2.iamerica.net/wh7262/ - there is a "we've moved" message:

This Site has moved to a faster web provider.
It is now at
http://home1.gte.net/wh7262/index.htm

Once again, the web site author has thoughtfully provided a clickable link to the new URL. This time, even the reason for the second move is given in the message.
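The two kinds of error message met so far arise at different stages of fetching a page. With "URL Not Found" the server was reached but had no page at that address, whereas with a DNS error the domain name itself could not be resolved. A web site author who wants to check their own outbound links could tell the two apart roughly as follows. This is a minimal sketch using Python's standard urllib library; the function names and the wording of the labels are assumptions made for this example.

```python
import socket
import urllib.error
import urllib.request


def classify_failure(exc):
    """Map an exception raised by urllib to a short problem label."""
    if isinstance(exc, urllib.error.HTTPError):
        # The server answered, but reported an error for this address.
        if exc.code == 404:
            return "URL Not Found"
        return f"HTTP error {exc.code}"
    if isinstance(exc, urllib.error.URLError):
        if isinstance(exc.reason, socket.gaierror):
            # The domain name could not be resolved at all.
            return "DNS Entry Not Found"
        return "Connection failed"
    return "Unknown problem"


def check_link(url, timeout=10):
    """Fetch a URL and return "OK" or a problem label.

    Only urllib's own errors are handled here; anything else
    (for example a malformed URL) is left to propagate.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return "OK"
    except urllib.error.URLError as exc:  # HTTPError is a subclass of URLError
        return classify_failure(exc)


# Example use (needs a network connection, so it is left commented out):
# print(check_link("http://www.rapidramp.com/bhold/index.html"))
```

Note that a check like this would report "OK" for the old South Carolina GenWeb address, since a page does load there. It cannot tell that the page now belongs to someone else, so a human still needs to look at the result.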

The Reasons for Change

The above four web sites were those in the sample which had changed URLs but whose new URLs were locatable. There were three other web sites in the sample which simply "disappeared" - they could no longer be located. The reason for a genealogy web site completely disappearing is self-evident. The author of the web site may lose interest in either genealogy itself or in sharing their research on the Internet.

Loss of interest or the inability to continue to maintain a site may lead to a URL change rather than a simple disappearance if the site in question is more than a personal genealogy site. As in the case of South Carolina GenWeb, a site associated with an ongoing project such as USGenWeb may change maintainers entirely. This change of "owners" may cause the site to vanish, only to reappear elsewhere.

Other reasons for a genealogy web site changing URLs are a bit more diverse. ISPs can go out of business. Web page authors may find another ISP provides cheaper service or, as in Bill Holder's case above, an ISP with a better level of service. Those individuals who create and maintain genealogy web sites may change their own URLs by re-designing their directory structure - either renaming web page files or moving web pages to new directories. Alternatively, an online genealogist's ISP might restructure their domain name or their user accounts, either of which would trigger a URL change.

Minimizing the Impact of Change

As changes in genealogy URLs appear common, some ideas on how best to minimize the impact of these changes are in order. After all, the whole purpose of a genealogy web site is to share our information with as many interested parties as possible. Therefore, we should concern ourselves with how to ensure that the maximum number of visitors possible can locate and visit our sites - particularly in the event that we move web site locations.

Both Bruce Jewell and Bill Holder have provided good examples of "change of address" pages. Whenever possible, maintain a "we've moved" message page at your former URL when moving to a new URL. By doing so, you will ensure that your 4th cousin twice removed, whom you have never met but who has the family Bible in their attic, will still be able to find you online. At a minimum, your new URL should be provided to the visitor as a clickable link.

"We've moved" web pages may also be written using a META tag in their HTML which generates a client pull. For details on the use of the META tag, see META - Meta-information, the section headed <META HTTP-EQUIV="refresh"...>. Client pull forces compliant browsers to go to your new URL after a preset period of time at your old URL. Using client pull gives visitors to your old URL a few seconds to read the "we've moved" message and then automatically redirects them to your new URL. The visitor is not even required to click on the link to your new URL - they are taken there effortlessly.
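A "we've moved" page combining both ideas, the clickable link and the client pull, might be generated along these lines. This is a sketch only: the function name moved_page, the five-second default delay and the exact wording of the message are choices made for this example, with the wording modelled loosely on Bruce Jewell's page.

```python
from html import escape


def moved_page(site_title, new_url, delay_seconds=5):
    """Build the HTML for a "we've moved" page with a client-pull redirect."""
    title = escape(site_title)
    url = escape(new_url, quote=True)
    delay = int(delay_seconds)
    return f"""<!DOCTYPE html>
<html>
<head>
<title>{title} has moved</title>
<!-- Client pull: compliant browsers go to the new URL after the delay. -->
<meta http-equiv="refresh" content="{delay}; url={url}">
</head>
<body>
<p>Please note {title} has moved to the following address:</p>
<!-- Clickable link for visitors whose browsers ignore the refresh. -->
<p><a href="{url}">{url}</a></p>
<p>Please update your links.</p>
</body>
</html>
"""


page = moved_page(
    "History and Family Tree of John Jewell",
    "http://www.dcscomp.com.au/jewell/family-history/",
)
print(page)
```

The resulting text would be saved as the page at the old URL (often index.html, depending on how the ISP's server is set up). Keeping the visible link alongside the META tag matters because not every browser honours the refresh.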

Web pages for online projects or organizations present special challenges when the individuals hosting those web pages change. As web space is usually provided by ISPs to individual rather than to group accounts, a change in hosts usually means a change in web site URL. This problem can be avoided by utilizing the web hosting services provided by the RootsWeb Genealogical Data Cooperative. RootsWeb offers no- or low-cost web hosting services to genealogical societies and to online projects such as USGenWeb. Such groups may host their web sites at RootsWeb and designate a web author responsible for their site. In the event that the web author for the group changes, as South Carolina GenWeb's did, RootsWeb can easily switch web site update access between individuals without requiring a change in the URL itself. With a service such as this available, there is no longer any excuse for a genealogical group's web site URL to change merely because the person responsible for the web site has changed.

When moving a web site from one URL to another, always be sure to inform the large link lists such as Cyndi's List of your new URL. Thousands of online genealogists use these sites daily to navigate the Internet. By ensuring that your URL is corrected on the link lists, visitors interested in your genealogical information will still be able to locate you. Locating "lost" sites on a genealogy link list was the method used to relocate Bill & Peggy's Genealogy site mentioned above. Besides informing the sites which specialize in indexing genealogy web sites, you should also submit your new URL to be spidered by the general purpose search engines.

Conclusions

This case study seems to indicate that changes in location are fairly common among genealogy web sites. Furthermore, the study indicates that web site content changes, as measured by changes to site graphics, do not appear to be as common as changes to web site locations.

Overall, changes to genealogy web sites do not represent a significant problem. Only 3 of the 12 sites in the sample were completely lost after sixteen months. That left 9 sites still available to the online genealogy community to use and benefit from. In addition, only 2 sites had their content of interest changed in that same period. Stated another way, 75% of the sites were still available and only about 17% had their content changed after 1 year and 4 months. While the transient nature of web sites on the Internet is a fact of the medium, this does not represent a major impediment to using the Internet for genealogical purposes.

Author's note:

Since this article was written, the new URL for the formerly "lost" Czechoslovak Genealogical Society International has been "found". The site's former webmaster kindly provided the new URL, which is http://members.aol.com/CGSI/index.html. The changed URL is apparently the result of the Society changing its account name with its ISP.

In addition, the Swedish Genealogy Page has moved since this article was written to http://www.acc.umu.se/~petersj/swegen.html.

Furthermore, Bruce Jewell's History & Family Tree of John Jewell moved yet again, first to http://members.dcscomp.com.au/jewell/family-history/ and then to http://www.jewell.asn.au/.

The former URL for the South Carolina GenWeb has been reassigned by GeoCities yet again and is now Jeff's Page of Garbage.




Link Rot in your Family Tree? - A Case Study in the Dynamics of Internet Genealogy
Created & maintained by Mark Howells.
For re-publication information about this article, please send email to markhow@oz.net
Updated August 31, 2001
Copyright © 1998 - 2001 by Mark Howells