On Tue, 19 Feb 2013 11:02:25 +0000, Ian Goddard <goddai01@hotmail.co.uk>
wrote:

>Tony Proctor wrote:
>> "Steve Hayes" <hayesstw@telkomsa.net> wrote in message
>> news:31g6i85ir4tlbhov78e19vf81e19c8toab@4ax.com...
>>> In an earlier message I suggested using AWK to manipulate a GEDCOM file to
>>> solve a particular problem.
>>>
>>> That point tended to get lost in discussion of other points like using
>>> other ways to solve the problem, or discussion of flaws in the GEDCOM data
>>> model itself and proposals for its replacement, which I see as a separate
>>> question.
>>>
>>> What I would like to see is the development of a kind of library of AWK
>>> routines to manipulate GEDCOM files. Lots of genealogists have GEDCOM
>>> files, and some would like to make changes to them, or extract information
>>> from them in ways that might not be possible with other genealogy programs.
>>
>> If you're OK using AWK Steve then I would recommend a more reliable
>> approach. Textual manipulation to solve a problem is usually
>> less-than-satisfactory due to ambiguities looking at plain text, and the
>> fact that a simple text-processing language cannot easily understand the
>> grammar of something like GEDCOM.
>>
>> In a similar vein, no one would (or should) try and manipulate an XML file
>> directly from its textual representation. They would first load it into a
>> DOM (Document Object Model) before processing the associated objects.
>>
>> I would recommend loading the GEDCOM file into an object-representation, in
>> memory, and manipulate its objects instead. I have seen free tools for doing
>> this although I do not have a reference to hand.
>
>This idea also occurred to me. I dismissed it fairly quickly.

I think you are being too dismissive.

I'm not talking about XML files, but about Gedcom files, and I'm not talking
about a DOM, but about AWK.

And I'm not talking about some hypothetical Platonic ideal of the perfect
Gedcom replacement, but about the actual Gedcom files that millions of
genealogists have on their computers now.

These "you can't get there from here" comments are really not very helpful.

>First it just becomes Yet Another GEDCOM Based Application. Like all
>the other YAGBAs it has the problems of dealing with the ways GEDCOM has
>been twisted by all the other YAGBAs & their users. Maybe it would need
>some sort of expert system to understand how to parse incoming GEDCOM
>files from A & how to write them for B.
>
>Then it would need all manner of editing functions otherwise someone
>would be complaining that although it does what someone else wanted it
>doesn't do what they want, and, of course, an easy-to-use user interface
>to all this. Then why can't it just provide for new data entry as well?
>And why can't it display a nice tree? And do reports? And and and...
>
>If its import and export facilities were good enough to be the universal
>GEDCOM handler it would need to be, scope creep would be inevitable. By
>v2.0 it would probably be adding its own GEDCOM semantics & become part of
>the problem.

--
Steve Hayes from Tshwane, South Africa
Blog: http://khanya.wordpress.com
E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk

Steve Hayes wrote:
> On Tue, 19 Feb 2013 11:02:25 +0000, Ian Goddard <goddai01@hotmail.co.uk>
> wrote:
>
>> Tony Proctor wrote:
>>> "Steve Hayes" <hayesstw@telkomsa.net> wrote in message
>>> news:31g6i85ir4tlbhov78e19vf81e19c8toab@4ax.com...
>>>> In an earlier message I suggested using AWK to manipulate a GEDCOM file to
>>>> solve a particular problem.
>>>>
>>>> That point tended to get lost in discussion of other points like using
>>>> other ways to solve the problem, or discussion of flaws in the GEDCOM data
>>>> model itself and proposals for its replacement, which I see as a separate
>>>> question.
>>>>
>>>> What I would like to see is the development of a kind of library of AWK
>>>> routines to manipulate GEDCOM files. Lots of genealogists have GEDCOM
>>>> files, and some would like to make changes to them, or extract information
>>>> from them in ways that might not be possible with other genealogy programs.
>>>
>>> If you're OK using AWK Steve then I would recommend a more reliable
>>> approach. Textual manipulation to solve a problem is usually
>>> less-than-satisfactory due to ambiguities looking at plain text, and the
>>> fact that a simple text-processing language cannot easily understand the
>>> grammar of something like GEDCOM.
>>>
>>> In a similar vein, no one would (or should) try and manipulate an XML file
>>> directly from its textual representation. They would first load it into a
>>> DOM (Document Object Model) before processing the associated objects.
>>>
>>> I would recommend loading the GEDCOM file into an object-representation, in
>>> memory, and manipulate its objects instead. I have seen free tools for doing
>>> this although I do not have a reference to hand.
>>
>> This idea also occurred to me. I dismissed it fairly quickly.
>
> I think you are being too dismissive.
>
> I'm not talking about XML files, but about Gedcom files, and I'm not talking
> about a DOM, but about AWK.
>
> And I'm not talking about some hypothetical Platonic ideal of the perfect
> Gedcom replacement, but about the actual Gedcom files that millions of
> genealogists have on their computers now.
>
> These "you can't get there from here" comments are really not very helpful.
>

Steve,

Note that I was replying to Tony, not you.

--
Ian

The Hotmail address is my spam-bin. Real mail address is
iang at austonley org uk

"Steve Hayes" <hayesstw@telkomsa.net> wrote in message news:h4p6i8l277lgaqsbl8ad70jkek4injed62@4ax.com... > On Tue, 19 Feb 2013 11:02:25 +0000, Ian Goddard <goddai01@hotmail.co.uk> > wrote: > > I think you are being too dismissive. > > I'm not taking about XML files, but about Gedcom files, and I'm not > talking > about a DOM, but about AWK. > > And I'm not taking about some hypothetical Platonic ideal of the perfect > Gedcom replacement, but about the actual Gedcom files that millions of > genealogists have on their computers now. > > These "you can't get there from here" comments are really not very > helpful. > > I gave advice based on experience Steve. For every AWK file you write, I can contrive a GEDCOM example that will break it. Another post in this thread mentioned ambiguities, and having to assume the availability of special characters that won't occur in names or notes. A scripting library could be useful, but I would never use one based simply on a text-processing language. I mentioned XML, not because you wanted to use it but as an analogy. Scripting manipulation of XML is usually done with XSLT, which for all its faults and obfuscation deals with entities rather than text. Tony Proctor
On 19.02.2013 20:39, Tony Proctor wrote:
> "Steve Hayes" <hayesstw@telkomsa.net> wrote in message
> news:h4p6i8l277lgaqsbl8ad70jkek4injed62@4ax.com...
>> On Tue, 19 Feb 2013 11:02:25 +0000, Ian Goddard <goddai01@hotmail.co.uk>
>> wrote:
>>
>> I think you are being too dismissive.
>>
>> I'm not talking about XML files, but about Gedcom files, and I'm not
>> talking about a DOM, but about AWK.
>>
>> And I'm not talking about some hypothetical Platonic ideal of the perfect
>> Gedcom replacement, but about the actual Gedcom files that millions of
>> genealogists have on their computers now.
>>
>> These "you can't get there from here" comments are really not very
>> helpful.
>>
>
> I gave advice based on experience Steve. For every AWK file you write, I can
> contrive a GEDCOM example that will break it.

If GEDCOM has a formal definition you will be able to write an awk program
to solve your task. According to a newer posting it seems that GEDCOM is a
very primitive definition; I see no reason why you cannot process it using
a language like awk.

> Another post in this thread
> mentioned ambiguities, and having to assume the availability of special
> characters that won't occur in names or notes.

Though, there's a way to handle those issues. It's not impossible, you just
have to be aware of them; otherwise you'd get surprises. But that just boils
down to "You need to know what to do.", which is valid for every approach.

>
> A scripting library could be useful, but I would never use one based simply
> on a text-processing language.

Why?

Janis

>
> I mentioned XML, not because you wanted to use it but as an analogy.
> Scripting manipulation of XML is usually done with XSLT, which for all its
> faults and obfuscation deals with entities rather than text.
>
> Tony Proctor

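To make the "you need to know what to do" point concrete, here is a rough sketch of awk-style, line-at-a-time processing that splits each line into fields and tracks which record it is in, compared with a naive substring match. It is written in Python rather than AWK purely for illustration; the field splitting mirrors what awk's $1 and $2 would give. The sample data and the function names individual_names and naive_name_lines are made up for this example, and the sketch assumes single-space delimiters and well-formed level numbers.

```python
def naive_name_lines(lines):
    """Plain text match: every line that contains the text '1 NAME'."""
    return [line for line in lines if "1 NAME" in line]


def individual_names(lines):
    """Yield (xref, name) for each level-1 NAME line inside an INDI record."""
    current_xref = None  # xref of the INDI record we are inside, if any
    for raw in lines:
        fields = raw.strip().split(" ", 2)  # roughly awk's $1, $2, and the rest
        if len(fields) < 2 or not fields[0].isdecimal():
            continue  # not a LEVEL TAG ... line, skip it
        level = int(fields[0])
        if level == 0:
            # "0 @I1@ INDI" opens an individual; any other level-0 line closes it.
            is_indi = (
                len(fields) == 3
                and fields[1].startswith("@")
                and fields[2].strip() == "INDI"
            )
            current_xref = fields[1] if is_indi else None
        elif level == 1 and current_xref is not None and fields[1] == "NAME":
            name = fields[2] if len(fields) == 3 else ""
            yield current_xref, name


# Made-up sample: the note text happens to contain something that looks
# like a NAME line.
SAMPLE = """\
0 HEAD
0 @I1@ INDI
1 NAME John /Smith/
1 NOTE The census lists him as
2 CONT 1 NAME Jon /Smyth/
0 @F1@ FAM
1 HUSB @I1@
0 TRLR
"""

lines = SAMPLE.splitlines()
print(len(naive_name_lines(lines)), "lines matched by plain text search")
for xref, name in individual_names(lines):
    print(xref, name)
```

The plain text search also picks up the continuation line inside the note, while the field-aware pass does not, because it checks the level number and the tag position instead of searching the whole line. The same logic carries over to an awk script that tests $1 and $2 rather than matching a regular expression against $0.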