- LDS Family History Department Adopts XML Standard

At a technical session of the GENTECH2001 conference last week, Randy Bryson of the Church of Jesus Christ of Latter-day Saints (the Mormons) announced that the Church is now standardizing on the XML markup language for all future software products. This announcement will have an immediate impact on producers of genealogy software and eventually will benefit all genealogists.

Mr. Bryson is the Director of the FamilySearch Internet Genealogy Service for the LDS Family History Department and also is the Information Technology Manager over the Ancestral File, Resource Files, Research Guidance and Extraction applications. As such, he is responsible for compatibility among these products.

The de facto data exchange standard for many years has been GEDCOM, a file format that is well-known for its imperfections. GEDCOM, an abbreviation for GEnealogical Data COMmunication, was created by the LDS Church in the mid-1980s as a method of exchanging genealogy data between different programs. The specifications for the GEDCOM file format have been updated a few times since then, and GEDCOM files have become the most common method of exchanging data between distant relatives. GEDCOM files also are used to contribute an individual's data to the large, centralized databases of the LDS Church and other organizations.

In their first iteration, GEDCOM files consisted of ASCII text. Unlike the binary files used by most other programs, a GEDCOM file can be opened with a simple text editor, and you can read the data contained therein. Later versions of GEDCOM were expanded to include ANSEL and Unicode, in addition to ASCII. Because of these updates, GEDCOM files can now handle umlauts, accents and other marks common in European alphabets. However, you can still read this data with a text editor, such as Windows Notepad.

GEDCOM has always suffered from numerous shortcomings, one limitation being the use of text.
Other limitations have included difficulties with handling non-European names, with handling imprecise data, and with handling the contradictory data that we all find in genealogy research.

In the 1990s, two separate and exhaustive studies of exchanging data between genealogy programs were made. The two were conducted more or less simultaneously:

1. One study was the GEDCOM Testbook Project, funded by GENTECH. The results of that project are called "GEDCOM Interchange Study Summary" and are available at: http://www.gentech.org/testbook/summary.htm

   The GENTECH effort later spun off a second, larger study, called the GENTECH Genealogical Data Model. While not dealing directly with the GEDCOM standard, it does address many issues that GEDCOM programmers need to be familiar with.

2. The other study was conducted by the Family History Department of the LDS Church. It resulted in the GEDCOM Future Directions document, published by the Family History Department, available at: ftp://gedcom.org/pub/genealogy/gedcom

The two studies were different in scope and purpose. Their conclusions and recommendations also differed somewhat, although they were similar in some ways. It is interesting to note that the XML standard was mostly unknown at the time these studies began but came into prominence before they concluded. While XML was not cited as a specific recommendation in either study, I have since heard the authors of both studies refer to XML as a possible solution to some of the shortcomings of today's methodologies.

XML is an abbreviation for "Extensible Markup Language," a markup language that has become very popular for applications that function on the World Wide Web. If you have made airline reservations online or purchased other goods from an online merchant, you have probably used an XML-based application without realizing it.

A discussion of XML is beyond the scope of this article.
For reference, I would suggest you start at http://www.xml.com or with any of the many good books on the topic available at your local bookstore.

I also should mention another answer to GEDCOM's shortcomings: Wholly Genes Software created GenBridge, a method of directly transferring data between different databases that does not use GEDCOM at all. While Wholly Genes has had great success with GenBridge, other software producers have not yet adopted it.

Randy Bryson's announcement of the adoption of XML illustrates the LDS Church's concerns and plans. Obviously, the programmers at the Family History Department have read these two studies and are proceeding with some of the recommendations. The introduction of XML will increase accuracy as well as allow for the use of non-European characters. A future release of the GEDCOM standard will be XML-based. The LDS databases, such as the Ancestral File, Pedigree Resource File, International Genealogical Index and others, will also accept XML data.

My guess is that the commercial Internet genealogy databases (Ancestry.com, genealogy.com, OneGreatFamily.com, etc.) will also convert to XML input, perhaps even before the LDS Church completes its conversion.

Obviously, all the genealogy programs used by individuals will also need to produce XML-formatted GEDCOM files in compliance with the new specification. I am sure we will see future versions of The Master Genealogist, Personal Ancestral File, Family Tree Maker, Family Origins, Legacy and other genealogy programs that will produce XML files, once the new GEDCOM replacement format has been defined.

None of this exists today. Randy Bryson's announcement simply indicates a future course. I suspect it will be two years or even longer before the new XML format is in place and in use. However, the benefits will justify the wait.
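For readers who program, here is a rough illustration of the difference between today's line-oriented GEDCOM text and an XML rendering of the same data. It is only a minimal sketch in Python. Since no XML-based GEDCOM specification has been defined, the "gedcom" wrapper element, the reuse of GEDCOM tags as element names, the "id" attribute and the sample person are all assumptions made purely for illustration. The sketch also ignores real-world details such as CONT/CONC continuation lines and character-set handling.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up GEDCOM fragment. Each line is "level [@xref@] TAG [value]".
SAMPLE_GEDCOM = """\
0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 12 MAR 1850
2 PLAC Bangor, Maine
"""


def parse_line(line):
    """Split one GEDCOM line into (level, xref, tag, value)."""
    parts = line.strip().split(" ", 2)
    level = int(parts[0])
    if len(parts) > 1 and parts[1].startswith("@"):
        # Record line such as "0 @I1@ INDI": the cross-reference precedes the tag.
        xref = parts[1].strip("@")
        rest = parts[2].split(" ", 1) if len(parts) > 2 else [""]
        tag = rest[0]
        value = rest[1] if len(rest) > 1 else ""
    else:
        xref = None
        tag = parts[1]
        value = parts[2] if len(parts) > 2 else ""
    return level, xref, tag, value


def gedcom_to_xml(text):
    """Nest GEDCOM lines by level number under a hypothetical <gedcom> root."""
    root = ET.Element("gedcom")  # hypothetical wrapper element, not from any spec
    stack = [root]  # stack[n] is the parent element for a line at level n
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        level, xref, tag, value = parse_line(line)
        del stack[level + 1:]  # drop elements deeper than this line's parent
        elem = ET.SubElement(stack[level], tag)
        if xref:
            elem.set("id", xref)  # hypothetical attribute name
        if value:
            elem.text = value
        stack.append(elem)
    return root


# Print the XML rendering of the sample fragment.
print(ET.tostring(gedcom_to_xml(SAMPLE_GEDCOM), encoding="unicode"))
```

The point of the sketch is that the hierarchy GEDCOM expresses with leading level numbers becomes explicit nesting in XML, which standard XML tools can then validate and process. Both forms remain plain text that you can read in an editor.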
==========================================================
DISCLAIMER:

This newsletter is being written and sent via e-mail at no charge. I expect to write one new issue on a more or less weekly basis. However, life sometimes interferes, and the need to earn a living may create an occasional delay.

==========================================================
COPYRIGHTS:

The contents of this newsletter are copyright by Richard W. Eastman with the following exception: Many of the articles published in these newsletters contain quotes or references from others, especially from other Web sites, software users manuals, press releases and other public announcements. Any words in this newsletter attributed to another person or organization remain the copyrighted materials of the original author(s).

You are hereby granted rights, unless otherwise specified, to re-distribute articles from this newsletter to other parties provided:

1. You do so strictly for non-commercial purposes

2. Your re-distribution is limited to one or two articles per newsletter; do not re-distribute the newsletter in its entirety

3. You may not republish any articles containing words attributed to another person or organization until you obtain permission from that person or organization. While you do have permission to republish words written by Richard W. Eastman, you do not have automatic authority to republish words written by others, even if their words appear in this newsletter.

Also, please include the following statement with any articles you re-distribute:

The following article is from Eastman's Online Genealogy Newsletter and is copyright 2001 by Richard W. Eastman. It is re-published here with the permission of the author.

Thank you for your cooperation.
==========================================================
Subscription information:

There are two different methods to subscribe to this free newsletter:

Method #1: to subscribe, to cancel an existing subscription, modify an existing subscription in any way or to read back issues, go to: http://www.rootsforum.com/newsletter.htm

Method #2: Send an e-mail to rootscomputing-subscribe@listbot.com

Please feel free to copy this subscription information and pass it on to anyone else who you think might be interested in obtaining a free subscription.

==========================================================
About the author:

Dick Eastman is the forum manager of the three Genealogy Forums on CompuServe. He also is the author of "YOUR ROOTS: Total Genealogy Planning On Your Computer" published by Ziff-Davis Press. He can be reached at: richard@eastman.net