On 24 Apr at 17:17, Dave Beakhust <[email protected]> wrote:

> Confirming that files are "identical" itself presents a challenge. For
> example, given earlier comments about inconsistency with spaces, does
> one interpret all white space of whatever length as "space"? Does one
> ignore or warn about trailing spaces? To enforce (say) that a hash of
> the file should match exactly is a severe test. Maybe it is even
> undesirable, depending on whether you allow flexibility on reading
> files but try to write only files in canonical form. Example: a
> placename: New Milton Should this match New Milton ? (probably a bad
> example but I don't have gedcom spec handy) A better example may be
> treatment of space between a tag and its value...

Very fair point. But the trouble about matching eg numbers of spaces is
that a special match program has to be written to allow for such
exceptions. Then that has to be validated. And a one month project would
end up being five years and would probably never complete. I do not
think it is difficult to say that the output GEDCOM should match the
input one, precisely.

> Dave Beakhust
>
>
> Sent from my iPhone
>
> On 24 Apr 2012, at 15:38, Tim Powys-Lybbe <[email protected]> wrote:
>
> > On 24 Apr at 12:01, "Adrian Bruce" <[email protected]> wrote:
> >
> > > My apologies to those who are bored by GEDCOM but its existence is
> > > fundamental to our ability to transfer data other than in
> > > human-only readable form. It concerns me that genealogical and
> > > family history societies are letting software people make the
> > > running on whether or not GEDCOM is replaced / enhanced / left to
> > > die.
> >>
> > > For what it's worth as a former IT professional of 30y standing
> > > writing and supporting software:
> >>
> > > Caroline said "when software developers blame the exporting
> > > program they abdicate their own responsibility to their
> > > customers".
> > > I certainly applaud those who tweak their software and
> > > hope they'll gain the market share they deserve. But we need to
> > > distinguish between the general responsibility to their customers
> > > of providing the optimum software and the responsibility to
> > > produce GEDCOM that is compliant to a standard. This
> > > responsibility exists and lies with the person who writes the
> > > export code. There are 2 reasons for this - if they call it a
> > > GEDCOM export, it should be that, not a half-hearted attempt.
> > > Secondly, if there are (say) 20 popular programs out there, and
> > > you write a 21st, you seriously do not want to be writing 20
> > > different import routines plus your own export - one export and
> > > one import ought to suffice.
> >>
> > > Sue said "Identifying "incorrect" GEDCOM is difficult because the
> > > specification is not entirely clear." I'd disagree. For the most
> > > part, the GEDCOM standard is perfectly clear and it annoys me that
> > > so many sling around the view that GEDCOM is flawed. (I suspect
> > > Sue, from her phrasing, doesn't belong to the extreme
> > > mud-slingers, though). Certainly, the casual reader will not find
> > > it at all clear - but that's not the target audience. In the
> > > BetterGEDCOM Wiki, it proved hard for any of the IT literate
> > > contributors to find an "error" in the specification - about the
> > > only one that sticks in my mind is that one could have an infinite
> > > loop of a Source referring to a standalone Note, which is
> > > justified by the first Source, which would refer to the same
> > > standalone Note, which is...
> >>
> > > This is NOT to say that GEDCOM is adequate for family history
> > > today. It isn't.
> > > The point is that all the new standards in the
> > > world won't help if the major problem is not with GEDCOM but with
> > > the fact that developers either can't be bothered to read the
> > > standard properly or can't be bothered to take all the steps
> > > necessary to reformat their own data to fit into the GEDCOM model.
> > > Neither of those problems will be fixed by a new or enhanced
> > > standard.
> >>
> > > In essence we need a 2-pronged approach - firstly we need to
> > > highlight the incompetencies of software suppliers who can't be
> > > bothered to understand the difference between CONT and CONC in a
> > > GEDCOM file. Secondly we need to agree on what family historians
> > > want from a revision / replacement of GEDCOM. (If we want
> > > anything). For instance, US genealogists tend to emphasise the
> > > data that goes into citations - are UK family historians satisfied
> > > with what they have in GEDCOM? Alternatively are we happy that
> > > FamilySearch will drive GEDCOMX (say) and only produce something
> > > to satisfy FS's needs?
> >>
> > > Adrian Bruce
> >
> > I agree with all that you say. The crux of the matter is the
> > software developers who write faulty programs to export and import
> > GEDCOM.
> >
> > Thinking further about this, I cannot see the Family Search people
> > doing anything to improve GEDCOM 5.5. But what is needed for any
> > version of GEDCOM is a method of testing that a genealogy program
> > correctly handles it. Let's try this:
> >
> > 1. A standard GEDCOM file should be developed that incorporates all
> > the features of GEDCOM. For GEDCOMX this can only be done by that
> > team.
> >
> > 2. The first test of compliance for any software is that it should
> > be able to receive in this standard file, create a genealogy in
> > their own format and then export from their program the GEDCOM. The
> > exported GEDCOM should be identical to the standard one.
> >
> > 3.
> >    The second test of compliance for any genealogy software is that
> > the program should be used to construct all the features that the
> > developers have incorporated. Then a GEDCOM should be created to
> > export the data. The first sub-check of compliance should be to then
> > import this GEDCOM and check that the recreated genealogy file is
> > identical to the original. The second sub-check of compliance
> > should be to then import the GEDCOM to another program that is
> > otherwise known to be compliant and then export from that another
> > GEDCOM and finally import the last GEDCOM into the program under
> > test and check that the regenerated genealogy is identical to the
> > original.
> >
> > I would expect to hear squeals of protest from the software
> > developers. We would depend on the reviewers and journalists to do
> > or commission these relatively simple tests in any report they made
> > on the various genealogy programs. A little pressure there?

-- 
Tim Powys-Lybbe                                  [email protected]
for a miscellany of bygones: http://powys.org/
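
For illustration only, the two matching policies discussed in this thread (an exact, hash-level match versus a match that tolerates differences in spacing) might be sketched as below. The function names, the normalisation rule and the sample records are assumptions made up for this sketch; nothing here is prescribed by the GEDCOM standard or by any particular program.

```python
import hashlib
import re


def exact_match(original: str, exported: str) -> bool:
    """Strict policy: the two files must hash identically."""
    def digest(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()
    return digest(original) == digest(exported)


def normalise(line: str) -> str:
    """Lenient rule (an assumption): drop trailing whitespace and
    collapse every run of whitespace to a single space."""
    return re.sub(r"\s+", " ", line.rstrip())


def lenient_match(original: str, exported: str) -> bool:
    """Lenient policy: compare the files line by line after normalising."""
    a = [normalise(line) for line in original.splitlines()]
    b = [normalise(line) for line in exported.splitlines()]
    return a == b


# Hypothetical records: the export differs only in spacing.
sample_in = "0 @I1@ INDI\n1 BIRT\n2 PLAC New Milton\n"
sample_out = "0 @I1@ INDI\n1 BIRT\n2 PLAC New  Milton \n"

print(exact_match(sample_in, sample_out))    # False
print(lenient_match(sample_in, sample_out))  # True
```

The strict check needs no rules beyond "the bytes agree", whereas the lenient one is only as trustworthy as its normalisation rule, which would itself have to be agreed and validated.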