RootsWeb.com Mailing Lists
    1. Re: GEDCOM as a database format
    2. Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:
    > On Sat, 22 Dec 2007 16:46:38 GMT, JD <jd4x4@<del.this>verizon.net>
    > declaimed the following in soc.genealogy.computing:
    >
    >> Except that XML is specifically extensible. Meaning that at SOME
    >> point the user or software would be required to handle "extra" data
    >> that may be an enhancement, or partially relevant/not relevant to the
    >> software and user's task. Having that built-in to the file format
    >> (and therefore the software) is a HUGE step forward, imo.
    >
    > And one way of "handling" that extra data is to simply ignore
    > unknown tags. No different than current GEDCOM. It also means that the

    I personally wouldn't want software that ignored something and didn't at
    least provide me with a means of dealing with it. But if it did, as you
    say, it can be transformed with XSLT, which is exactly what I did with my
    publishing source data. I had several sources, each with slightly
    differing source schemas, but all of the data was related and I created
    my own "final" schema from them, for the use I required. And the software
    allowed my data to be exported with my schema, which could then (if
    desired) be transformed back into the two original source schemas. And I
    didn't have to consult multiple sources to find out for sure whether
    source A widgets were the same as source B widgets, because I knew the
    definition as well as the context from the respective schema.

    > XML would not have a doctype and can not be used by a validating
    > parser. If it has a defined doctype then it has probably gone through
    > some standardization process and the client programs have been coded
    > to make use of that "standard".

    My DTP software allowed me to define my own schema and validate it; see
    the above. The "standardization" you refer to doesn't mean that the data
    has been changed, only reordered... to YOUR schema. Software doesn't
    have to do that alone. YOUR schema can be YOUR ordering of the data.

    > Yes, XML is "extensible"... But in practically all commercial
    > software, that "extension" ability is achieved by first creating a
    > doctype that defines what all the VALID entities will be for an
    > application. Parsers then reference that doctype, and maybe XSLT
    > transformation rules to change the presentation of the data.
    >
    > While an XSLT transform may be able to work without needing an
    > external doctype reference (as long as both the XML document and the
    > XSLT rules were created in parallel) these transforms are typically
    > from one representation of the data to another. Where I work, we have
    > an XML doctype used to define data types, interface methods, etc...
    > The transform engine (currently uses Oxygen8) with a set of XSLT
    > templates can take the XML to generate either an Ada main program and
    > linkage source code, or generate a structured (docbook) set of
    > documentation. Neither operation will succeed fully unless all tags in
    > the XML validate against the doctype specification file.

    And should you be in a position (as I was) to need a third application
    for the data that would have a somewhat different output... you could
    then define your own schema and validate against it.

    > XML is, in effect, a language for defining application specific
    > mark-up languages. As such, the end "language" is only useful if all
    > sides agree on how to handle the content of the tags.

    Not true. Both sides only need to agree on what the most descriptive,
    accurate, and useful rendering of the DATA for the content is.

    >> Data from a multitude of good sources (along with bad sources) is on
    >> the web. Eventually ALL of the data from good sources will be tagged
    >> using XML tags at some point. Imagine good data, properly tagged,
    >> able to be read by your software and auto-tagged with its source
    >> info, and PASSED ON because the file format supports it.
    >
    > So far as I know, even current GEDCOM supports source citing -- but
    > the users of the programs are not rigorous in doing that.

    So we don't need to perpetuate that by not providing a mechanism that
    could facilitate automation, imo.
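[Editor's note] Dennis's "simply ignore unknown tags" option and JD's objection to silent discarding can both be made concrete. The sketch below is a minimal illustration, not any real genealogy schema: the element names and the `KNOWN` set are invented. It imports the fields a program understands while parking unrecognised elements somewhere a user could still deal with them:

```python
import xml.etree.ElementTree as ET

# Hypothetical record mixing "known" elements with a vendor extension.
DOC = """
<individual id="I1">
  <name>John Smith</name>
  <birth>1692</birth>
  <x-manumission>1720</x-manumission>
</individual>
"""

KNOWN = {"name", "birth"}

def import_individual(xml_text):
    """Import known fields; keep unknown elements as name/value pairs
    rather than silently discarding them."""
    root = ET.fromstring(xml_text)
    record = {"id": root.get("id"), "extras": {}}
    for child in root:
        if child.tag in KNOWN:
            record[child.tag] = (child.text or "").strip()
        else:
            record["extras"][child.tag] = (child.text or "").strip()
    return record

rec = import_individual(DOC)
print(rec["name"])      # John Smith
print(rec["extras"])    # {'x-manumission': '1720'}
```

The point of the `extras` dict is exactly JD's complaint: the data survives the import, so a later program that does understand the tag can still use it.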

    12/22/2007 09:33:37
    1. Re: GEDCOM as a database format
    2. Wes Groleau
    3. Ian Goddard wrote:
    > The problem is that the unknown tag is part of the structure and has to
    > be recognised by the parser. In XML we would have an element such as
    > <OptionalData> with an attribute such as "type". Both of these would be
    > part of the structure and _any_ program using the schema as import would
    > recognise them. The value of the attribute and the element would

    GEDCOM structure also allows for non-standard tags. It also has a few
    poorly defined tags that have to be clarified by a free-form TYPE or
    NOTE. It even added the PEDI and TYPE tags to put free-form modifiers
    on the formerly narrowly defined NAME and CHIL tags. And of course,
    there are many well-defined tags that some programs do not support.
    Whether an item is a GEDCOM tag or an XML element, what does a program
    do that has no place for it in its internal data structures? Unless you
    forbid it, it will do what the LDS's own PAF program does: stick it in
    a NOTE (or discard it completely). Either response prevents the data
    from being properly used after that by a program that _does_ support it.

    Recoding the GEDCOM 5.5 data model into the 6.0 XML structure does not
    solve any problem with the data model. Nor does offering a new
    structure solve the problem of the spec not being followed.

    The solution to GEDCOM's problems involves not only a better data
    model, but also sufficient demand that software vendors feel the need
    to adopt it. Several of the former may exist, but the latter does not.

    --
    Wes Groleau
    If you put garbage in a computer nothing comes out but garbage. But
    this garbage, having passed through a very expensive machine, is
    somehow ennobled and none dare criticize it.
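[Editor's note] Wes's point about non-standard tags is easy to see against the GEDCOM line grammar quoted elsewhere in the thread ("LEVEL TAG DATA or LEVEL XREF TAG"). Here is a minimal sketch of a line parser that keeps vendor extensions (tags starting with "_"; the `_MILSVC` tag is an invented example) instead of dropping them or burying them in a NOTE:

```python
import re

# GEDCOM lines are "LEVEL [XREF] TAG [DATA]".  Custom tags conventionally
# start with "_".  This sketch keeps them instead of dropping them.
LINE = re.compile(r"^(\d+)\s+(@[^@]+@\s+)?(\S+)(?:\s(.*))?$")

def parse_line(line):
    m = LINE.match(line.rstrip("\n"))
    if not m:
        raise ValueError("not a GEDCOM line: %r" % line)
    level, xref, tag, data = m.groups()
    return {
        "level": int(level),
        "xref": xref.strip() if xref else None,
        "tag": tag,
        "data": data or "",
        "custom": tag.startswith("_"),   # vendor extension, kept anyway
    }

sample = [
    "0 @I1@ INDI",
    "1 NAME John /Smith/",
    "2 CAUS Drowning",
    "1 _MILSVC 1914-1918",               # a non-standard tag
]
for rec in map(parse_line, sample):
    print(rec["level"], rec["tag"], rec["data"],
          "(custom)" if rec["custom"] else "")
```

Whether the importing program then files `_MILSVC` under its own field, an "extras" store, or a NOTE is exactly the policy question the thread is arguing about; the parse itself is trivial either way.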

    12/22/2007 06:21:43
    1. Re: GEDCOM as a database format
    2. singhals
    3. Wes Groleau wrote:
    > Ian Goddard wrote:
    >> The problem is that the unknown tag is part of the structure and has
    >> to be recognised by the parser. In XML we would have an element such
    >> as <OptionalData> with an attribute such as "type". Both of these
    >> would be part of the structure and _any_ program using the schema as
    >> import would recognise them. The value of the attribute and the
    >> element would
    >
    > GEDCOM structure also allows for non-standard tags.
    > It also has a few poorly defined tags that have to be clarified
    > by a free-form TYPE or NOTE. It even added the PEDI and TYPE tags
    > to put free-form modifiers on the formerly narrowly defined NAME
    > and CHIL tags. And of course, there are many well-defined tags
    > that some programs do not support. Whether an item is a GEDCOM tag
    > or an XML element, what does a program do that has no place for it
    > in its internal data structures? Unless you forbid it, it will
    > do what the LDS's own PAF program does: stick it in a NOTE
    > (or discard it completely).
    >
    > Either response prevents the data from being properly used after that
    > by a program that _does_ support it.
    >
    > Recoding the GEDCOM 5.5 data model into the 6.0 XML structure does
    > not solve any problem with the data model. Nor does offering a new
    > structure solve the problem of the spec not being followed.
    >
    > The solution to GEDCOM's problems involves not only a better data
    > model, but also sufficient demand that software vendors feel the need
    > to adopt it. Several of the former may exist, but the latter does not.

    Precisely. To rehash my "eye color: Harvard" vs "Education: Grey"
    canard, if the data aren't equal/standard, then the exchange cannot be
    standardized. Which pretty much boils down to: if everyone used MY
    program-of-choice, then data would transfer much more smoothly.

    Or, phrased differently: the only way for BigWally in St. George UT to
    receive 12-oz wine glasses from 8 different shippers is for 8 different
    shippers to SHIP 12-oz wine glasses. If the shipper ships 11-oz wine
    glasses, the means of shipping (FedEx, USPS, UPS, DHL, RailwayExpress,
    or Greyhound Bus) won't make 'em 12-ozs when they get to UT.

    IMO
    Cheryl

    12/22/2007 03:19:03
    1. Re: GEDCOM as a database format
    2. Ian Goddard
    3. T.M. Sommers wrote:
    > Ian Goddard wrote:
    >
    > Perhaps I should have been more explicit and said that I see no
    > advantages of XML over GEDCOM for the uses to which GEDCOM is put,
    > namely passing data back and forth between genealogical programs.

    But the purpose for which GEDCOM was designed, as another poster has
    made clear, was more restricted than that - it was to pass data to and
    from the LDS database and nothing more. That's why the definition only
    includes data elements of interest to LDS. It's other programs which
    have latched onto it and, AIUI, added their own private extensions.
    Let me take one of your points out of sequence to elaborate on this:

    > The main complaint, I think, is that programs do not implement all of
    > standard GEDCOM. There is no reason to believe that that would change
    > if GEDCOM were replaced by XML.
    >
    > And it is a bit ironic to use the *Extensible* Markup Language to
    > prevent genealogical programs from extending their data-transfer
    > language.
    >
    >> One way to handle optional stuff like that in XML would be to use
    >> attributes, e.g.
    >>
    >> <OptionalData type="CauseOfDeath">Drowning</OptionalData>
    >
    > Ugh. This is better than
    >
    > 2 CAUS Drowning
    >
    > how?

    Fair enough. Having given the choice of example a couple of seconds'
    thought, I ended up with something for which GEDCOM has a tag and for
    which an XML format would probably also have a standard element. So
    let's try something for which it doesn't: criminal conviction. Or
    military service. Or a rite of passage in a religion other than
    Christianity or Judaism. Or manumission. Or the charter and manorial
    court references to which we have to resort when we get back beyond
    parish registers. AIUI the response of genealogy S/W developers is to
    coin their own tags. Which is fine if we want to pass data between
    users of that particular program. Not so fine if users of two programs
    want to exchange data or a user wants to migrate to another program.

    What should a program do if it encounters an unknown tag? Discard it
    silently? Reject the structure to which it belongs? Reject the whole
    file? The problem is that the unknown tag is part of the structure and
    has to be recognised by the parser. In XML we would have an element
    such as <OptionalData> with an attribute such as "type". Both of these
    would be part of the structure and _any_ program using the schema as
    import would recognise them. The value of the attribute and the
    element would probably be represented internally as a name/value pair
    in some way. The essential point is that the program doesn't have to
    recognise the value of the type attribute in the same way that it
    would have to recognise the non-standard tag in a GEDCOM style
    situation.

    > You seem to want to use XML as some sort of reporting language for
    > your own database. I suppose it would work for that, but why bother?
    > If it's your own program, you have direct access to the data, and can
    > directly create any kind of report you want, without the hassle of
    > first converting everything to XML.

    No, what I want is something which doesn't start off by being designed
    to fill a specific role and has to be stretched in an ad hoc manner to
    do related stuff. I have no quarrel with GEDCOM not being open-ended
    in its abilities. It was designed for a purpose and it fulfills it.
    But that purpose is essentially to collect sequences of names and
    dates and some LDS specific stuff. Surely we can aspire to more than
    that?

    >>> The availability of parsing tools for XML is not a real advantage.
    >>> You still have to write the callbacks, and that is where most of
    >>> the work is. A GEDCOM line is trivial to parse: it is either LEVEL
    >>> TAG DATA or LEVEL XREF TAG.
    >>
    >> That's for a SAX parser.
    >
    > DOM, too.

    The Delphi app to which I referred used a SAX parser for which I had
    to write event handlers which built up an object which was then passed
    to a single callback. It also used the MS DOM for which I had to write
    no event handler - I just used it as an XSLT engine. (If it sounds odd
    using two parsers in tandem, it was because the raw XML documents were
    so large that the DOM would have run out of memory. The SAX parser
    could chomp its way through the documents, chopping them up into
    bite-size chunks to feed to the DOM.)

    --
    Ian
    Hotmail is for spammers. Real mail address is igoddard at nildram co uk
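[Editor's note] Ian's tandem arrangement - a SAX parser chopping a too-large document into bite-size chunks for a DOM/XSLT engine - has a close stdlib analogue in Python's streaming `iterparse`. This sketch processes one small subtree at a time so the whole document never has to fit in memory; the `<individual>` schema is hypothetical:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical feed: one <individual> record per entry; imagine millions.
BIG_DOC = io.StringIO(
    "<dump>"
    + "".join(f"<individual><name>Person {i}</name></individual>"
              for i in range(3))
    + "</dump>"
)

def individuals(stream):
    """Stream records one small tree at a time instead of building a DOM
    for the whole document (which might not fit in memory)."""
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "individual":
            yield elem.findtext("name")
            elem.clear()          # release the chunk once processed

print(list(individuals(BIG_DOC)))   # ['Person 0', 'Person 1', 'Person 2']
```

Each yielded record is a fully built mini-tree, so downstream code can treat it exactly as it would treat a DOM - which is the effect Ian's SAX-feeds-DOM pipeline achieved in Delphi.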

    12/22/2007 02:23:00
    1. Re: GEDCOM as a database format
    2. "T.M. Sommers" <tms@nj.net> wrote:
    <snip>
    > Perhaps I should have been more explicit and said that I see no
    > advantages of XML over GEDCOM for the uses to which GEDCOM is
    > put, namely passing data back and forth between genealogical
    > programs. You seem to want to use XML as some sort of reporting
    <snip>

    Except that XML is specifically extensible. Meaning that at SOME point
    the user or software would be required to handle "extra" data that may
    be an enhancement, or partially relevant/not relevant to the software
    and user's task. Having that built-in to the file format (and
    therefore the software) is a HUGE step forward, imo.

    Data from a multitude of good sources (along with bad sources) is on
    the web. Eventually ALL of the data from good sources will be tagged
    using XML tags at some point. Imagine good data, properly tagged, able
    to be read by your software and auto-tagged with its source info, and
    PASSED ON because the file format supports it.

    12/22/2007 09:46:38
    1. Re: GEDCOM as a database format
    2. Tehenne
    3. T.M. Sommers <tms@nj.net> wrote:
    > The main complaint, I think, is that programs do not implement
    > all of standard GEDCOM. There is no reason to believe that that
    > would change if GEDCOM were replaced by XML.

    Nothing to add, moreover. The program using a database which best
    respects the GEDCOM format (95%) is the one that I wrote. It is a
    choice which the others did not make, so I do not see what XML could
    change there. All the rest is idle discussion ;o)

    Merry Christmas.

    --
    Téhenne
    Saint-Denis de la Réunion
    genealogy program : ohmiGene (Mac & PC): http://www.nauze.com/
    Comparative analysis Import-Export Gedcom: http://www.nauze.com/gedcom/
    Digest du format Gedcom : http://www.nauze.com/format_gedcom/index.html
    Sorry for my english

    12/22/2007 09:01:31
    1. Re: GEDCOM as a database format
    2. Hugh Watkins
    3. T.M. Sommers wrote:
    > Ian Goddard wrote:
    >> T.M. Sommers wrote:
    >>>
    >>> I fail to see any technological advantages of XML over GEDCOM.
    >>> Their syntax is different, but their semantics are essentially the
    >>> same: both are lists of trees.
    >>
    >> The advantage of XML is the range of tools it provides. Given an
    >> XSLT engine such as Saxon you could transform your XML output into
    >> HTML by writing a suitable stylesheet. Or into FO for a
    >> pretty-printer report. Or into an SVG file for a diagram.
    >
    > Perhaps I should have been more explicit and said that I see no
    > advantages of XML over GEDCOM for the uses to which GEDCOM is put,
    > namely passing data back and forth between genealogical programs.
    > You seem to want to use XML as some sort of reporting language for
    > your own database. I suppose it would work for that, but why bother?
    > If it's your own program, you have direct access to the data, and
    > can directly create any kind of report you want, without the hassle
    > of first converting everything to XML.
    >
    >>> The availability of parsing tools for XML is not a real advantage.
    >>> You still have to write the callbacks, and that is where most of
    >>> the work is. A GEDCOM line is trivial to parse: it is either LEVEL
    >>> TAG DATA or LEVEL XREF TAG.
    >>
    >> That's for a SAX parser.
    >
    > DOM, too.
    >
    >> I've used the Woods SAX parser in Delphi. The handler and
    >> call-backs are small compared to the volume of code which is simply
    >> reused.
    >
    > No smaller than the comparable code to handle GEDCOM would be.
    >
    >>> GEDCOM allows users to define their own tags, and I don't know if
    >>> XML validating parsers can handle that.
    >>
    >> The main advantage of a validating parser is to prevent just that!
    >> In any thread about GEDCOM one usually finds complaints about
    >> limited inter-operability resulting from one package's not
    >> understanding another's tags.
    >
    > The main complaint, I think, is that programs do not implement all
    > of standard GEDCOM. There is no reason to believe that that would
    > change if GEDCOM were replaced by XML.
    >
    > And it is a bit ironic to use the *Extensible* Markup Language to
    > prevent genealogical programs from extending their data-transfer
    > language.
    >
    >> One way to handle optional stuff like that in XML would be to use
    >> attributes, e.g.
    >>
    >> <OptionalData type="CauseOfDeath">Drowning</OptionalData>
    >
    > Ugh. This is better than
    >
    > 2 CAUS Drowning
    >
    > how?

    this grinds on

    a gedcom is a text document
    an xml document is also text but with mark ups
    using css or other templates you can edit in more ways csv . .

    publication for human eyes is to paper or screen with or without
    attached or embedded images or media

    any family tree program is a text editor

    it is possible to upload a gedcom to a database like Custodian 3 for
    searches

    uploading a gedcom to http://worldconnect.rootsweb.com/ backs it up
    off site as a freebie and privatises it, with the 1930 US Federal
    census as the model cut-off point; it also allows other ways of
    viewing and searching the data

    sharing is also to ancestry.com and its little sisters

    I also took an ahnentafel generated in
    http://worldconnect.rootsweb.com/ saved it and edited it in a wysiwyg
    html editor and uploaded it elsewhere so it sits well in google

    googling my mothers maiden name
    http://www.google.com/search?num=100&hl=en&newwindow=1&q=%22alison+mary%22+lapham&btnG=Search

    enjoy and have a happy Christmas and a great New Year

    Hugh W
    --
    For genealogy and help with family and local history in Bristol and
    district http://groups.yahoo.com/group/Brycgstow/
    http://snaps4.blogspot.com/ photographs and walks
    GENEALOGE http://hughw36.blogspot.com/ MAIN BLOG

    12/22/2007 04:21:35
    1. Re: GEDCOM as a database format
    2. T.M. Sommers
    3. Ian Goddard wrote:
    > T.M. Sommers wrote:
    >>
    >> I fail to see any technological advantages of XML over GEDCOM.
    >> Their syntax is different, but their semantics are essentially the
    >> same: both are lists of trees.
    >
    > The advantage of XML is the range of tools it provides. Given an
    > XSLT engine such as Saxon you could transform your XML output into
    > HTML by writing a suitable stylesheet. Or into FO for a
    > pretty-printer report. Or into an SVG file for a diagram.

    Perhaps I should have been more explicit and said that I see no
    advantages of XML over GEDCOM for the uses to which GEDCOM is put,
    namely passing data back and forth between genealogical programs. You
    seem to want to use XML as some sort of reporting language for your
    own database. I suppose it would work for that, but why bother? If
    it's your own program, you have direct access to the data, and can
    directly create any kind of report you want, without the hassle of
    first converting everything to XML.

    >> The availability of parsing tools for XML is not a real advantage.
    >> You still have to write the callbacks, and that is where most of
    >> the work is. A GEDCOM line is trivial to parse: it is either LEVEL
    >> TAG DATA or LEVEL XREF TAG.
    >
    > That's for a SAX parser.

    DOM, too.

    > I've used the Woods SAX parser in Delphi. The handler and call-backs
    > are small compared to the volume of code which is simply reused.

    No smaller than the comparable code to handle GEDCOM would be.

    >> GEDCOM allows users to define their own tags, and I don't know if
    >> XML validating parsers can handle that.
    >
    > The main advantage of a validating parser is to prevent just that!
    > In any thread about GEDCOM one usually finds complaints about
    > limited inter-operability resulting from one package's not
    > understanding another's tags.

    The main complaint, I think, is that programs do not implement all of
    standard GEDCOM. There is no reason to believe that that would change
    if GEDCOM were replaced by XML.

    And it is a bit ironic to use the *Extensible* Markup Language to
    prevent genealogical programs from extending their data-transfer
    language.

    > One way to handle optional stuff like that in XML would be to use
    > attributes, e.g.
    >
    > <OptionalData type="CauseOfDeath">Drowning</OptionalData>

    Ugh. This is better than

    2 CAUS Drowning

    how?

    --
    Thomas M. Sommers -- tms@nj.net -- AB2SB
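[Editor's note] For what it's worth, the two spellings being argued over here carry the same name/value pair, which a few lines of Python make visible. The `CAUS` to `CauseOfDeath` mapping table is assumed for illustration; neither side of the thread defines one:

```python
import xml.etree.ElementTree as ET

def from_gedcom(line):
    # "2 CAUS Drowning" -> ("CauseOfDeath", "Drowning") via a tag map
    level, tag, data = line.split(" ", 2)
    names = {"CAUS": "CauseOfDeath"}        # hypothetical mapping table
    return names.get(tag, tag), data

def from_xml(fragment):
    # '<OptionalData type="CauseOfDeath">Drowning</OptionalData>'
    elem = ET.fromstring(fragment)
    return elem.get("type"), elem.text

print(from_gedcom("2 CAUS Drowning"))
print(from_xml('<OptionalData type="CauseOfDeath">Drowning</OptionalData>'))
# both print ('CauseOfDeath', 'Drowning')
```

So the disagreement is really about the surrounding machinery (validation, tooling, extensibility), not about what either line can express.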

    12/21/2007 08:46:00
    1. Re: Index cards
    2. Peter J Seymour
    3. singhals wrote:
    > Peter J Seymour wrote:
    >
    >> One form of output available from Gendatam Suite is index cards.
    >> Simple enough so far. However, a problem is how to restrict data,
    >> even summary data, so that it does not overflow. I have not used
    >> index cards in practice (at least not in genealogy) so I have a
    >> question for anyone who has: What do you do when the card fills up?
    >> Do you just squash on a bit more, do you go over to the back or do
    >> you go to a continuation card, or even some combination of these. I
    >> would be interested to hear of experience of this.
    >> Peter
    >
    > People I know/knew who use/used the index-card system put one fact
    > per 3x5 card. These days, apparently, that should be written put one
    > piece of evidence per card. As in:
    >
    > CRESAP, Thomas (1)                    Entrydate: .....
    >
    > FACT: .....
    > SOURCE: .....
    >
    > Where line one represents the name of the person mentioned in the
    > fact and the associated (ID #) and the date the card was created. A
    > blank space is left to make the file-clerk's life easier, then the
    > fact/evidence is recorded baldly on line 3 and the source of the
    > fact/evidence is on the next line.
    >
    > One of them puts the repository and its contact info on the back of
    > the card.
    >
    > All the cards about CRESAP, Thomas (1) are filed in one segment.
    > Critical thinking/data analysis occurs when they are all taken out
    > and arranged in various orders.
    >
    > Some transfer conclusions to a card of a different color; and some
    > then discard all the others.
    >
    > Eventually you end up with a yellow card that says
    >
    > CRESAP, Thomas (1)
    > bapt when, where
    > marr when, where, to whom
    > died when, where
    > ISSUE: listed with (ID #)
    >
    > Cheryl

    Thanks, that makes sense. The point seems to be not to try to put too
    much on a card.
    Peter

    12/21/2007 02:55:10
    1. Re: Index cards
    2. singhals
    3. Peter J Seymour wrote:
    > One form of output available from Gendatam Suite is index cards.
    > Simple enough so far. However, a problem is how to restrict data,
    > even summary data, so that it does not overflow. I have not used
    > index cards in practice (at least not in genealogy) so I have a
    > question for anyone who has: What do you do when the card fills up?
    > Do you just squash on a bit more, do you go over to the back or do
    > you go to a continuation card, or even some combination of these. I
    > would be interested to hear of experience of this.
    > Peter

    People I know/knew who use/used the index-card system put one fact per
    3x5 card. These days, apparently, that should be written put one piece
    of evidence per card. As in:

    CRESAP, Thomas (1)                    Entrydate: .....

    FACT: .....
    SOURCE: .....

    Where line one represents the name of the person mentioned in the fact
    and the associated (ID #) and the date the card was created. A blank
    space is left to make the file-clerk's life easier, then the
    fact/evidence is recorded baldly on line 3 and the source of the
    fact/evidence is on the next line.

    One of them puts the repository and its contact info on the back of
    the card.

    All the cards about CRESAP, Thomas (1) are filed in one segment.
    Critical thinking/data analysis occurs when they are all taken out and
    arranged in various orders.

    Some transfer conclusions to a card of a different color; and some
    then discard all the others.

    Eventually you end up with a yellow card that says

    CRESAP, Thomas (1)
    bapt when, where
    marr when, where, to whom
    died when, where
    ISSUE: listed with (ID #)

    Cheryl

    12/20/2007 03:45:36
    1. Re: Use of XML?
    2. Ian Goddard <goddai01@hotmail.co.uk> wrote:
    > Everett M. Greene wrote:
    >>
    >> How does machine manipulation "improve" genealogical data?
    >
    > Store it. Edit it. Find it. Sort it. Reformat it into pretty print
    > reports, web pages or diagrams.
    >
    > Not necessarily improving the data but improving your use of it.

    In retrospect my other reply should have been more to the point like
    Ian's was, so I'll add: reducing the data to just what it is, the
    labels given to it known, and the relationship method (if any) known.
    Like in the date question example being discussed.

    12/19/2007 03:46:07
    1. Re: Use of XML?
    2. Ian Goddard
    3. Everett M. Greene wrote:
    >
    > How does machine manipulation "improve" genealogical data?

    Store it. Edit it. Find it. Sort it. Reformat it into pretty print
    reports, web pages or diagrams.

    Not necessarily improving the data but improving your use of it.

    --
    Ian
    Hotmail is for spammers. Real mail address is igoddard at nildram co uk

    12/19/2007 03:22:24
    1. Index cards
    2. Peter J Seymour
    3. One form of output available from Gendatam Suite is index cards. Simple enough so far. However, a problem is how to restrict data, even summary data, so that it does not overflow. I have not used index cards in practice (at least not in genealogy) so I have a question for anyone who has: What do you do when the card fills up? Do you just squash on a bit more, do you go over to the back or do you go to a continuation card, or even some combination of these. I would be interested to hear of experience of this. Peter

    12/19/2007 11:19:16
    1. Re: Use of XML?
    2. singhals <singhals@erols.com> wrote:
    > JD <jd4x4@ wrote:
    >
    >> Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:
    >>
    >> <snip>
    >>> programs... How is a program that expects:
    >>>
    >>> <date>mm/dd/yyyy</date>
    >>>
    >>> going to handle a file generated by a program that produces:
    >>>
    >>> <date>
    >>>   <year>yyyy</year>
    >>>   <month>mm</month>
    >>>   <day>dd</day>
    >>> </date>
    >>
    >> Simply by "remapping" back & forth because the expected formats are
    >> defined in each schema. We can even add an attribute that indicates
    >> the calendar type without mucking things up.
    >
    > Don't forget 12 Thermidor XII when you're indicating calendar types;
    > and don't miss the Hebrew and Hindu calendars which are lunar, or
    > the occasional 13th month in the Hindu calendar or the difference
    > between light 2nd and dark 2nd of the month and what's going to
    > happen to the 2nd 14th of the month?

    :-) I'm SO glad you're interested in the subject of XML! (really!)
    So, now I'm interested in how you use those calendars in your daily
    work with info about people/places/relationships, etc., but that's
    another topic. I, for one, WOULD forget about 12 Thermidor XII
    (that's an order of food for a large group of people, right?). Not
    that it really matters to XML... what would really matter is what the
    basic elements are and if/how they relate to the others in the data
    set.

    I may not have been aware of these calendars before you shared your
    schema with me, but since you used a calendar tag, I know that they
    now at least fit (somehow) into my uses that require a "date"
    element; but without knowing for sure what the actual elements
    are/mean to you, I'll have to either store them or discard them. And
    I can't remap my data to send back to you, so unless I already have
    some of your "calendar" tags just stored & unadulterated to send, I'm
    afraid I won't be of much help to you. Maybe you can calculate in
    your schema how to map them to mine, since you know the calendar I
    used? NOW we're getting someplace...

    <tangible objects>
      <people>
        <cheryl></cheryl>
        <JD></JD>
      </people>
    </tangible objects>
    <not tangible objects>
      <concepts>
        <time/space>
          <measure>
            <calendar>
              <gregorian>
                <year></year>
                <month></month>
                <day></day>
              <12 Thermidor XII></12 Thermidor XII>
            </calendar>
          </measure>
        </time/space>
      </concepts>
    </not tangible objects>

    So that's my data model for my use, right at this moment. Personally,
    there isn't anything that I can think of (right now) that I couldn't
    classify into at least the two core elements. I rarely do that
    because it isn't of great use to me unless I'm in deep thought or
    something (like now!). But that's how I could use your info, if you
    could at least put it into that context of (minimum) the two core
    elements.

    It would be nice not to have to go through all of that if there was a
    common point at which we diverged, don't you think? Since you know my
    grouping, it benefits both of us to at least find the most useful
    common point and assume the rest, imo. And we can have computers do a
    lot of the categorizing and relationship calculations.
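[Editor's note] JD's "remapping back & forth" claim for the two date schemas in this exchange can be sketched directly: both of Dennis's hypothetical formats normalise to the same tuple. Calendar handling is deliberately left out, since the posters agree that is the hard part:

```python
import xml.etree.ElementTree as ET

# Two hypothetical date schemas from the thread: a flat mm/dd/yyyy string
# and a structured year/month/day element, normalised to one internal form.
def read_date(elem):
    """Return (year, month, day) from either schema."""
    if elem.text and elem.text.strip():        # <date>12/22/2007</date>
        mm, dd, yyyy = elem.text.strip().split("/")
    else:                                      # <date><year>...</year>...</date>
        yyyy = elem.findtext("year")
        mm = elem.findtext("month")
        dd = elem.findtext("day")
    return int(yyyy), int(mm), int(dd)

flat = ET.fromstring("<date>12/22/2007</date>")
nested = ET.fromstring(
    "<date><year>2007</year><month>12</month><day>22</day></date>")
print(read_date(flat), read_date(nested))   # both (2007, 12, 22)
```

Writing back out in either schema from the normalised tuple is equally mechanical; a calendar attribute, as JD suggests, would just become a fourth member of the internal form, with the conversion rules singhals raises living in code rather than in the file format.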

    12/19/2007 10:11:59
    1. Re: Use of XML?
    2. Ian Goddard
    3. singhals wrote:
    >
    > Could some of the problems be cured _IF_ the genie program were used
    > to record ONLY conclusions?

    I suspect that this is the only reasonable way to use some of them.
    Nobody ever responded to my challenge to name one that would have the
    functionality equivalent to index cards and paper clips.

    > Ask any Defense attorney.

    Don't! They're too expensive to talk to. I spent plenty of time being
    asked questions by them, however.

    --
    Ian
    Hotmail is for spammers. Real mail address is igoddard at nildram co uk

    12/19/2007 10:09:27
    1. Re: Use of XML?
    2. Ian Goddard
    3. Everett M. Greene wrote:
> JD <jd4x4@<del.this>verizon.net> writes:
>> Ian Goddard <goddai01@hotmail.co.uk> wrote:
>>> In practice it seems that the GEDCOM type of model has influenced genealogical S/W to the extent that there doesn't seem to be any real advance on it.
>> Again, one of the problems coming out of GEDCOM is that the DATA is getting mixed with the model. The data is and will always be what it is, right? The differences are in how we each use it and think of it, imo.
> All this discussion is interesting (and we hope useful), but could there be some elaboration on the above points? It would seem that genealogical info describes a network of parents to/from children with nets spliced/joined by marriages. How does one "improve" on this "model"?

I think the knowledge that we all have a network of parents tracing us back to Pooh-Bah's "protoplasmic globule" has proved to be too much of a temptation for many S/W authors. It's a nice, tempting structure on which to base a program. The trouble is that although we know that the structure exists we don't actually *know* who occupied its nodes.

Half a working life spent investigating the past has left me acutely aware that what happened in the past, the evidence it left behind and my interpretation of it are three different things - and your interpretation would be a fourth.

The past is gone. We can't go there. The best we can do is look for the evidence, analyse it and interpret it. In any form of investigation keeping evidence and interpretation distinct is essential and this, ISTM, is what most, if not all, genealogical S/W fails to do.

If we find that John, son of William Smith, was baptised in 1692 this gives us a couple of names and roles. There are very few ways in which S/W handles this. One way, the individual-centred, is to invite us to edit the records for John and William if they exist and add the event and date, or to create a new record for either if it doesn't exist. But the decision as to whether the individuals in this newly discovered event are the same as those already in the database is interpretation. So whether we choose to enter or edit, we are leaping straight to interpretation and relegating the evidence to a footnote.

Another way, event-centred, is to allow us to add the event and create new individual records for John and William. This gives us what appears to be a nicely structured representation of the evidence: a record of the event and records of the names and roles of its participants. We can move to interpretation in our own time, deferring for as long as we wish the question of whether, for instance, this John is the same John as the one who married in 1714 and was the father of Mary Smith, baptised in 1715. If, however, we decide that the infant and the bridegroom were indeed one and the same, the only option available is to merge the two records. What appeared initially to be a record which was part of the evidence structure has suddenly been treated as part of the interpretation. If we were to decide that we shouldn't have merged them we will have to hand-craft another record to replace the one that was discarded during the merge and patch up the links.

What we need is an evidence-centred approach which will allow us to enter the event and the name/role records as part of the evidence and retain them as permanent records. We would have a different record which would represent our historical reconstruction (interpretation) of John Smith. We would then link both our evidential records of John to this. If we change our mind about the interpretation, all we need to do is delete the link we don't want. Alternatively we might have a link which carries enough information to record that we're not sure whether the identification is correct. We could even link an evidential record to more than one reconstruction until we make up our minds.

In data modeling terms the evidential and interpretational records are different kinds of entity even if they're both characterised by a name. This, to my mind, is the minimum model needed to underpin good investigative practice. Personally I'd want to make the link which represents the identification between evidence and interpretation into an entity in its own right, I'd want to make sources into a hierarchy capable of representing such structure as archives, collections and documents (Gramps is a good example of this), and I'd want each record to have a globally unique identifier (see http://en.wikipedia.org/wiki/Globally_Unique_Identifier) to facilitate information sharing.

This produces a more complex model than that which underlies GEDCOM. We could use GEDCOM to represent the evidential view or the interpretational view, but we can't use it to do both at the same time, nor can it tell us which view it's representing.
--
Ian
Hotmail is for spammers. Real mail address is igoddard at nildram co uk
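The evidence-centred model Ian describes can be sketched in a few lines of code. This is a minimal illustration under my own assumptions, not any existing package's API; every class and field name here is invented for the example. The point it demonstrates: evidential records are permanent, the reconstruction is a separate kind of record, and the identification linking them is an entity in its own right (with its own GUID and a certainty note), so it can be doubted or deleted without touching the evidence.

```python
import uuid
from dataclasses import dataclass, field

@dataclass(frozen=True)
class EvidencePersona:
    """A name/role exactly as it appears in one source, e.g. 'John, son of William'."""
    name: str
    role: str
    event: str
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))

@dataclass(frozen=True)
class Reconstruction:
    """An interpretational record: our working idea of a historical person."""
    label: str
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))

@dataclass
class Identification:
    """The evidence-to-interpretation link, modelled as an entity in its
    own right so it can carry a degree of doubt and be deleted freely."""
    persona: EvidencePersona
    reconstruction: Reconstruction
    certainty: str = "uncertain"  # e.g. 'certain', 'probable', 'uncertain'
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))

# Two pieces of evidence that may or may not concern the same man:
baptism = EvidencePersona("John Smith", "child", "baptism 1692")
marriage = EvidencePersona("John Smith", "groom", "marriage 1714")

john = Reconstruction("John Smith")

links = [
    Identification(baptism, john, "certain"),
    Identification(marriage, john, "uncertain"),
]

# Changing our mind is just deleting a link; both evidential records
# survive intact, with no merge and no hand-crafted replacement record.
links = [l for l in links if l.certainty == "certain"]
print(len(links))  # 1
```

Contrast this with the merge operation described above: once two records are merged, un-merging means reconstructing a discarded record by hand, whereas here the undo is a one-line deletion.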

    12/19/2007 06:39:50
    1. Re: Use of XML?
    2. singhals
    3. Ian Goddard wrote:
> [...]
> This produces a more complex model than that which underlies GEDCOM. We could use GEDCOM to represent the evidential view or the interpretational view, but we can't use it to do both at the same time, nor can it tell us which view it's representing.

Could some of the problems be cured _IF_ the genie program were used to record ONLY conclusions?

Back in the long-long-ago, before we had computer genealogy programs, we had working forms and final forms; the final form was prettily written in ink on quality paper and represented our conclusions. The working forms were scribbled in pencil on something approaching newsprint, stapled together by name, one form per source. When we REACHED a conclusion, we annotated the final one, on the back, with remarks like "The family of the same names in South Fork is a different family" or "There is a family in North Branch which is very similar to this family."

I find it inefficient to constantly have to re-evaluate my data. Often a clue which didn't make it to the Xerox copy or the digital image contributed to a thought which led me to another source where I found what I considered conclusive evidence. I don't always remember why I went to look in East Widget records when the family before and after was in South Som'ers... without remembering why, I can easily reach a different conclusion.

Remember, there is no fact that cannot be misinterpreted in at least three different ways, and most evidence can also be interpreted many ways. Ask any Defense attorney.

Cheryl

    12/19/2007 04:19:22
    1. Re: Use of XML?
    2. Everett M. Greene
    3. JD <jd4x4@<del.this>verizon.net> writes:
> mojaveg@mojaveg.lsan.mdsg-pacwest.com (Everett M. Greene) wrote:
>> JD <jd4x4@<del.this>verizon.net> writes:
>>> Ian Goddard <goddai01@hotmail.co.uk> wrote:
>>>> In practice it seems that the GEDCOM type of model has influenced genealogical S/W to the extent that there doesn't seem to be any real advance on it.
>>> Again, one of the problems coming out of GEDCOM is that the DATA is getting mixed with the model. The data is and will always be what it is, right? The differences are in how we each use it and think of it, imo.
>> All this discussion is interesting (and we hope useful), but could there be some elaboration on the above points? It would seem that genealogical info describes a network of parents to/from children with nets spliced/joined by marriages. How does one "improve" on this "model"?
> Not just this particular "model", because there are others that use the same core data. I'm thinking that since many "data babies" get thrown out with the "genealogy", "family history", and who knows what other bath water with every day that goes by, at least a mechanism to capture and improve the data sets with machines is in order.

How does machine manipulation "improve" genealogical data?

> And, I see "XML" as being a help with that. Now, "XML" covers a LOT of ground, not just a "data model", or a "standard", or even a "language"... its strength lies in smart use of all of these aspects together in the correct, ever-expanding implementation, imo. Data that's tagged in at least a basic improved fashion every day could be of use. Also, apps for genealogy should help with the exchanges and tagging, at least in the background (like in the date example & my scenario in the previous post), if they were XML compliant.
>
> Hope this makes a little sense.. I'm late for a task & in a rush!

    12/19/2007 01:09:16
    1. Re: Use of XML?
    2. Ian Goddard <goddai01@hotmail.co.uk> wrote:
> JD <jd4x4@ wrote:
>> http://www.genopro.com/
> I had a quick look at the site. It's a Windows package so I'd need to find out if it runs under Wine.

I can't swear to it, but I thought I saw a reference to it running under Wine, though I don't remember the context. Look through the forums and then maybe the developer info. The site itself isn't horribly revealing about much lying below the surface.

> It looks as if it's essentially a graphical interface. I wonder how standard their symbology actually is. In the past I've used the class diagram facility of a UML package for diagramming family relationships, mostly because it was there and I could cobble up a diagram on it fairly quickly. A good UML package should be able to export data as XMI. So, if I were to put a little effort into setting up proper stereotypes.....hmmm.

I think I also saw a reference to configurable symbol sets, but at the time I was trying to find out if the graphics in general used SVG, and have tentatively concluded they may not. But when I get a chance I'm going to try a demo install to really poke about.

    12/18/2007 07:16:54
    1. Re: Use of XML?
    2. Hugh Watkins
    3. Everett M. Greene wrote:
> JD <jd4x4@<del.this>verizon.net> writes:
>> Ian Goddard <goddai01@hotmail.co.uk> wrote:
>>> In practice it seems that the GEDCOM type of model has influenced genealogical S/W to the extent that there doesn't seem to be any real advance on it.
>> Again, one of the problems coming out of GEDCOM is that the DATA is getting mixed with the model. The data is and will always be what it is, right? The differences are in how we each use it and think of it, imo.
> All this discussion is interesting (and we hope useful), but could there be some elaboration on the above points? It would seem that genealogical info describes a network of parents to/from children with nets spliced/joined by marriages. How does one "improve" on this "model"?

just open any small gedcom in a text editor and study the links by the @@@

Hugh W
--
For genealogy and help with family and local history in Bristol and district
http://groups.yahoo.com/group/Brycgstow/
http://snaps4.blogspot.com/ photographs and walks
GENEALOGE http://hughw36.blogspot.com/ MAIN BLOG
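Hugh's suggestion can also be done in a few lines of code rather than by eye. The fragment below is a hand-written toy GEDCOM (not taken from any real file): each record is defined on a level-0 line of the form `0 @X@ TAG`, and other records point to it with `@X@` values on lines such as FAMC, FAMS, HUSB and CHIL. Those pointers are the links in question.

```python
import re

# A toy GEDCOM fragment: two individuals and one family record,
# tied together by @...@ cross-reference pointers.
gedcom = """0 @I1@ INDI
1 NAME John /Smith/
1 FAMC @F1@
0 @I2@ INDI
1 NAME William /Smith/
1 FAMS @F1@
0 @F1@ FAM
1 HUSB @I2@
1 CHIL @I1@
"""

# IDs defined at level 0 ('0 @X@ TAG') vs IDs referenced as line values.
defined = re.findall(r"^0 @([^@]+)@", gedcom, flags=re.M)
referenced = re.findall(r"@([^@]+)@$", gedcom, flags=re.M)

print(defined)                                  # ['I1', 'I2', 'F1']
print(sorted(set(referenced) - set(defined)))   # [] - every pointer resolves
```

The second print is a crude lint: any ID that is pointed to but never defined would show up there, which is one quick machine check that eyeballing a file in a text editor can't give you.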

    12/18/2007 02:49:32