On 2013-02-21 03:55, Steve Hayes wrote:
> One of the things I would really like to have is an event-based program, but
> TMG isn't it. See here:

Do you think there is any scope for resurrecting gendatam, I guess as a free
project? It has the feature of being individual-based, or event-based, or
any-way-you-want based. It is only some specific record types, such as
Assertions, that have to be attached to another record. In the jargon it is a
NoSQL system, and field relationships are hardly defined at all. Works well
for me.
On Wed, 20 Feb 2013 18:42:28 -0500, Dennis Lee Bieber <wlfraed@ix.netcom.com>
wrote:

>On Wed, 20 Feb 2013 06:50:38 +0200, Steve Hayes <hayesstw@telkomsa.net>
>declaimed the following in soc.genealogy.computing:
>
>>
>> XML files are also data files, and some have suggested that GEDCOM be replaced
>> by XML files, and perhaps they may happen one day.
>>
> Which won't help at all unless one can enforce one XML schema upon
>all users.
>
> But if one could enforce one schema, one could also enforce one
>truly standard GEDCOM too.
>
> But you know some genealogy program will want to extend the XML to
>handle their special features. For all practical purposes, NO
>hierarchical tree schema could fully handle the data from programs like
>the defunct UFT, and still-produced TMG.
>
> GEDCOM is based on two basic "01" level entries: INDIviduals, and
>FAMilies. Event based programs, instead, are based on life events which
>are linked to individuals in particular roles. Any XML schema that also
>focuses on individual and family records would have the same problem.

[subject line changed and follow-ups set]

I still use FHS, which I think is the only program to use the GEDCOM 1.x format
to export ALL its data. It supports PAF Gedcom, which I think was GEDCOM 2.x,
but that only exports a subset of data -- it excludes events.

But I think it is misleading to speak of programs like TMG as "event-based".
I've tried TMG, and found it impossible to enter an event without first
entering an individual, so it too is individual-based.

One of the things I would really like to have is an event-based program, but
TMG isn't it. See here:

http://hayesgreene.blogspot.com/2011/05/event-based-history-and-genealogy.html

--
Steve Hayes from Tshwane, South Africa
Blog: http://khanya.wordpress.com
E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
On 2/19/2013 7:15 PM, Dennis Lee Bieber wrote:
>
> PS E:\UserData\Wulfraed\My Documents> get-content e:\sample.ged
> 0 @I1@ INDI
> 1 NAME Gerald "Bernard" /Landry/
> 2 GIVN Gerald "Bernard"
> 2 SURN Landry
> 1 SEX M
> 1 BIRT
> 2 DATE 9 MAR 1937
> 2 PLAC St-Jacques
> 0 @I2@ INDI
> 1 NAME Bernard /St-Jacques/
> 2 GIVN Bernard
> 2 SURN St-Jacques
> 1 SEX M
> 1 FAMS @F1@
>
> PS E:\UserData\Wulfraed\My Documents> get-content e:\sample.ged |
> foreach {$_ -replace '(.*) NAME (.*) "(.*)" (.*)', '$1 $2 ~$3~ $4'
> -replace '(.*) GIVN (.*) "(.*)"', '$1 GIVN $2 ~$3~'} >e:\new.ged
>
> PS E:\UserData\Wulfraed\My Documents> get-content e:\new.ged
> 0 @I1@ INDI
> 1 Gerald ~Bernard~ /Landry/

You left out the NAME tag.

--
T.M. Sommers -- ab2sb
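[Editor's sketch] The NAME tag disappears in the quoted command because the first replacement string, '$1 $2 ~$3~ $4', never puts it back; '$1 NAME $2 ~$3~ $4' would. The same nickname rewrite can also be done in awk without capture groups, which leaves no tag to forget. This is a minimal sketch, not Dennis's method: it assumes that every double quote on a NAME or GIVN line marks a nickname, and it uses a shortened copy of the sample records from the post.

```sh
tmp=$(mktemp -d)

# Shortened copy of the sample records quoted in the post.
cat > "$tmp/sample.ged" <<'EOF'
0 @I1@ INDI
1 NAME Gerald "Bernard" /Landry/
2 GIVN Gerald "Bernard"
2 SURN Landry
1 SEX M
EOF

# On NAME and GIVN lines only, turn every double quote into a tilde.
# Only the quote characters change, so the tag itself always survives.
out=$(awk '$2 == "NAME" || $2 == "GIVN" { gsub(/"/, "~") } { print }' "$tmp/sample.ged")
printf '%s\n' "$out"

rm -rf "$tmp"
```

Because only single characters are swapped, the level number, tag and surname slashes pass through untouched.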
On Wed, 20 Feb 2013 06:43:30 -0600, Ed Morton <mortonspam@gmail.com> wrote:

>On 2/20/2013 2:25 AM, Steve Hayes wrote:
>> On Wed, 20 Feb 2013 08:41:54 +0100, Janis Papanagnou
>> <janis_papanagnou@hotmail.com> wrote:
>>
>>> On 20.02.2013 07:05, Steve Hayes wrote:
>>>> On Tue, 19 Feb 2013 14:06:25 +0100, Janis Papanagnou
>>>> <janis_papanagnou@hotmail.com> wrote:
>>>>
>>> [...]
>>>>>
>>>>> awk '$2 !~ /@.*@/ { sub(/Service/, "S.") } { print $0 }'
>>>>
>>>> I substituted "Ellwood1.ged" for "!~" and got this:
>>>
>>> What did you intend to do?
>>
>> See what happened when I ran it on a file.
>
>Post what you tried because what you say you did `substituted "Ellwood1.ged" for
>"!~"` sounds like you replaced the regexp inequality operator with the name of a
>file which doesn't make sense. If you wanted to run Janis's script on a file
>named Ellwood1.ged you'd do:
>
>awk '$2 !~ /@.*@/ { sub(/Service/, "S.") } { print $0 }' Ellwood1.ged
>
>if you're on UNIX. If you're on Windows, it's different so tell us which OS
>you're on.

I'm on Windows.

Tried it as you suggested, and this was the result:

H:\>awk '$2 !~ /@.*@/ { sub(/Service/, "S.") } { print $0 }' Ellwood1.ged

DOSPRN Print Spooler. Version 1.77
(c) 1990-2004 by Gurtjak D., Ignatenko I., Goldberg A.
Use extended memory: 200K
Use conventional memory: 4K
'$2
^
awk: line 0: syntax error

H:\>gawk '$2 !~ /@.*@/ { sub(/Service/, "S.") } { print $0 }' Ellwood1.ged
gawk: '$2
gawk: ^ invalid char ''' in expression

H:\>gawk $2 !~ /@.*@/ { sub(/Service/, "S.") } { print $0 } Ellwood1.ged
gawk: cmd. line:1: fatal: cannot open file `!~' for reading (No such file or
directory)

H:\>

I tried first with an older version of AWK, and later with a newer one (called
gawk above).

When it said it didn't like the "'" characters, I removed them and tried
again.

--
Steve Hayes from Tshwane, South Africa
Blog: http://khanya.wordpress.com
E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
On 20.02.2013 17:14, Steve Hayes wrote:
[...]
>
> I'm on Windows.

On WinDOS always put your awk program[*] in a file, say xyz.awk,
and call it using option -f

  awk -f xyz.awk Ellwood1.ged

otherwise you'll get the quoting issues that you describe below
and probably other hassles as well.

Janis

[*] The "awk program" is only the code inside the single quotes.

>
> Tried it as you suggested, and this was the result:
>
> H:\>awk '$2 !~ /@.*@/ { sub(/Service/, "S.") } { print $0 }' Ellwood1.ged
>
> DOSPRN Print Spooler. Version 1.77
> (c) 1990-2004 by Gurtjak D., Ignatenko I., Goldberg A.
> Use extended memory: 200K
> Use conventional memory: 4K
> '$2
> ^
> awk: line 0: syntax error
>
> H:\>gawk '$2 !~ /@.*@/ { sub(/Service/, "S.") } { print $0 }' Ellwood1.ged
> gawk: '$2
> gawk: ^ invalid char ''' in expression
>
> H:\>gawk $2 !~ /@.*@/ { sub(/Service/, "S.") } { print $0 } Ellwood1.ged
> gawk: cmd. line:1: fatal: cannot open file `!~' for reading (No such file or
> directory)
>
> H:\>
>
> I tried first with an older version of AWK, and later with a newer one (called
> gawk above).
>
> When it said it didn't like the "'" characters, I removed them and tried
> again.
>
>
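[Editor's sketch] Janis's advice spelled out: the program that sat between the single quotes goes into xyz.awk, and the command line then contains nothing that needs quoting. The errors above arise because the Windows command prompt does not treat single quotes as quoting characters. The sketch below is written as a POSIX shell script only so that it is self-contained; on Windows one would create xyz.awk in an editor and type just the final awk -f command. The three GEDCOM lines are a made-up stand-in for Ellwood1.ged.

```sh
tmp=$(mktemp -d)

# The awk program from the thread, stored in a file instead of on the command line.
cat > "$tmp/xyz.awk" <<'EOF'
# Lines whose second field is an @...@ pointer are left alone;
# on every other line the first "Service" becomes "S.".
$2 !~ /@.*@/ { sub(/Service/, "S.") }
{ print $0 }
EOF

# Made-up stand-in for Ellwood1.ged.
cat > "$tmp/Ellwood1.ged" <<'EOF'
0 @I1@ INDI
1 NAME John Service /EXAMPLE/
1 OCCU Civil Service
EOF

# The only awk command line needed; it contains no quote characters at all.
out=$(awk -f "$tmp/xyz.awk" "$tmp/Ellwood1.ged")
printf '%s\n' "$out"

rm -rf "$tmp"
```

A side benefit of the file form is that comments can be kept next to each rule.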
On Wed, 20 Feb 2013 09:52:47 -0500, singhals <singhals@erols.com> wrote:

>(G) The loose nut on the keyboard?

PICNIC: Problem In Chair, Not In Computer.

Hugh
On Wed, 20 Feb 2013 08:41:54 +0100, Janis Papanagnou
<janis_papanagnou@hotmail.com> wrote:

>On 20.02.2013 07:05, Steve Hayes wrote:
>> On Tue, 19 Feb 2013 14:06:25 +0100, Janis Papanagnou
>> <janis_papanagnou@hotmail.com> wrote:
>>
>[...]
>>>
>>> awk '$2 !~ /@.*@/ { sub(/Service/, "S.") } { print $0 }'
>>
>> I substituted "Ellwood1.ged" for "!~" and got this:
>
>What did you intend to do?

See what happened when I ran it on a file.

I'll play with it some more.

--
Steve Hayes from Tshwane, South Africa
Blog: http://khanya.wordpress.com
E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
Denis Beauregard wrote:
> On Tue, 19 Feb 2013 22:08:25 -0500, singhals<singhals@erols.com>
> wrote in soc.genealogy.computing:
>
>> Kenny McCormack wrote:
>>
>>> I also agree with (whoever it was) the poster who wrote that feature-creep
>>> will eventually doom this project (writing a GEDCOM lib in AWK).
>>
>> That's one of the things wrong with GED -- people wanted it
>> to do things it was /never/ intended to do, but developers
>> of GEDStand "tweaked" the standard to allow each s/w
>> developer to (a) meet the minimum standard while (b) getting
>> 95% on moves between two users of the same program. People
>> started complaining before the GEDStan developers got out of
>> their hotel, because, hey, only 95%?
>>
>> Cheryl
>> ahhh, yes, I am still trying to forget GedStan discussions!
>
> You know what is wrong with Gedcom ? The format itself is something
> very powerful and I am surprised it is not in use in other fields.
> But the format is adapted to a hierarchical structure that is not
> compatible with a straight column-based database. So, what is wrong
> is not the format but its use.

(G) The loose nut on the keyboard?

The custom reports, saved as a csv and imported into a
spreadsheet, do a fair amount of the most common sorting I
want to do. When it doesn't, I go RTFM to see what I can find.

ONE of the filters I'd like to have is something along the
lines of "show me everyone whose surname matches C-610 and
whose wife's given name contains Edna". Or, sometimes more
urgent, "show me all the Edna whose husband's given name
contains George." Yeah, I've got a spreadsheet, but it
ought to be do-able from within the program.

Cheryl
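[Editor's sketch] The second of Cheryl's filters ("all the Edna whose husband's given name contains George") can be run directly against a GEDCOM export. This is a minimal sketch under several assumptions that are not from the thread: spouses are linked through FAM records with HUSB and WIFE pointers, the given name is whatever precedes the first "/" on an individual's first NAME line, and the names w and h are variables invented for this example. The sample records are made up. The surname filter on C-610 is left out because it would need a Soundex routine.

```sh
tmp=$(mktemp -d)

cat > "$tmp/spouses.awk" <<'EOF'
# Level-0 line: remember the record's pointer and type.
$1 == 0 { id = $2; type = $3; next }

# First NAME line of an individual: keep the full name and the part before the first "/".
type == "INDI" && $1 == 1 && $2 == "NAME" && !(id in name) {
    full = $0
    sub(/^[0-9]+ +NAME +/, "", full)
    name[id] = full
    sub(/ *\/.*/, "", full)
    given[id] = full
}

# Family records point at the husband and wife by their @...@ ids.
type == "FAM" && $2 == "HUSB" { husb[id] = $3 }
type == "FAM" && $2 == "WIFE" { wife[id] = $3 }

# After the whole file is read, test each couple against the two patterns.
END {
    for (fam in wife)
        if (given[wife[fam]] ~ w && given[husb[fam]] ~ h)
            print name[wife[fam]] " (husband: " name[husb[fam]] ")"
}
EOF

# Made-up sample data: two couples, only one of which should match.
cat > "$tmp/sample.ged" <<'EOF'
0 @I1@ INDI
1 NAME George /EXAMPLE/
0 @I2@ INDI
1 NAME Edna May /SAMPLE/
0 @I3@ INDI
1 NAME Henry /EXAMPLE/
0 @I4@ INDI
1 NAME Edna /DEMO/
0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
0 @F2@ FAM
1 HUSB @I3@
1 WIFE @I4@
EOF

out=$(awk -v w=Edna -v h=George -f "$tmp/spouses.awk" "$tmp/sample.ged")
printf '%s\n' "$out"

rm -rf "$tmp"
```

Passing the two names with -v means the same file answers other such questions without being edited.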
On Wed, 20 Feb 2013 09:52:47 -0500, singhals <singhals@erols.com> wrote:

>Denis Beauregard wrote:
>> On Tue, 19 Feb 2013 22:08:25 -0500, singhals<singhals@erols.com>
>> wrote in soc.genealogy.computing:
>>
>>> Kenny McCormack wrote:
>>>
>>>> I also agree with (whoever it was) the poster who wrote that feature-creep
>>>> will eventually doom this project (writing a GEDCOM lib in AWK).
>>>
>>> That's one of the things wrong with GED -- people wanted it
>>> to do things it was /never/ intended to do, but developers
>>> of GEDStand "tweaked" the standard to allow each s/w
>>> developer to (a) meet the minimum standard while (b) getting
>>> 95% on moves between two users of the same program. People
>>> started complaining before the GEDStan developers got out of
>>> their hotel, because, hey, only 95%?
>>>
>>> Cheryl
>>> ahhh, yes, I am still trying to forget GedStan discussions!
>>
>> You know what is wrong with Gedcom ? The format itself is something
>> very powerful and I am surprised it is not in use in other fields.
>> But the format is adapted to a hierarchical structure that is not
>> compatible with a straight column-based database. So, what is wrong
>> is not the format but its use.
>
>(G) The loose nut on the keyboard?
>
>The custom reports, saved as a csv and imported into a
>spreadsheet, do a fair amount of the most common sorting I
>want to do. When it doesn't, I go RTFM to see what I can find.
>
>ONE of the filters I'd like to have is something along the
>lines of "show me everyone whose surname matches C-610 and
>whose wife's given name contains Edna" Or, sometimes more
>urgent, "show me all the Edna whose husband's given name
>contains George." Yeah, I've got a spreadsheet, but it
>ought to be do-able from within the program.
>
>Cheryl
>
>

I think most programs will let you do that, but it is labor-intensive: it
involves creating a subset that matches the first criterion, exporting a
gedcom, then operating on that subset and again exporting a gedcom, thus
creating the subset that meets both criteria.

Although I don't do SQL, there are users on the RM users list who apply SQL
to the RM database to do just this kind of thing... without the messy need
for creating gedcoms.
On 20.02.2013 07:05, Steve Hayes wrote:
> On Tue, 19 Feb 2013 14:06:25 +0100, Janis Papanagnou
> <janis_papanagnou@hotmail.com> wrote:
>
[...]
>>
>> awk '$2 !~ /@.*@/ { sub(/Service/, "S.") } { print $0 }'
>
> I substituted "Ellwood1.ged" for "!~" and got this:

What did you intend to do?

>
> gawk: (FILENAME=ellwood1.ged FNR=40697) fatal: cannot open file `/@.*@/' for
> reading (Invalid argument)

The !~ is an operator which is to be read as "does not match".

  var ~ /pattern/     # var matches pattern
  var !~ /pattern/    # var doesn't match pattern
  var == "string"     # var equals string
  var != "string"     # var doesn't equal string

>
> Please forgive my ignorance -- I'm still feeling my way in the dark with this
> stuff.

Feel free to ask. There's also Arnold's very fine book about gawk
available online: http://www.gnu.org/software/gawk/manual/gawk.html

Janis
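[Editor's sketch] The four operators side by side, each used as the pattern of one rule and applied to the second field of three GEDCOM lines taken from the sample in this thread. The labels printed in front of each line are invented for the illustration. Every input line is tested against all four rules, so each one is printed twice.

```sh
out=$(printf '%s\n' '0 @I98BW-JC@ INDI' '1 NAME Emily Jane /THORNTON/' '2 SURN THORNTON' |
awk '
$2 ~ /@.*@/  { print "pointer  : " $0 }   # second field matches the pattern
$2 !~ /@.*@/ { print "no @..@  : " $0 }   # second field does not match it
$2 == "NAME" { print "NAME tag : " $0 }   # second field equals the string
$2 != "NAME" { print "other tag: " $0 }   # second field differs from the string
')
printf '%s\n' "$out"
```

The first two rules compare against a regular expression, the last two against an exact string; that is the whole difference between ~ and ==.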
On Tue, 19 Feb 2013 14:06:25 +0100, Janis Papanagnou
<janis_papanagnou@hotmail.com> wrote:

>On 19.02.2013 10:14, Steve Hayes wrote:
>> In an earlier message I suggested using AWK to manipulate a GEDCOM file to
>> solve a particular problem.
>>
>> That point tended to get lost in discussion of other points like using other
>> ways to solve the problem, or discussion of flaws in the GEDCOM data model
>> itself and proposals for its replacement, which I see as a separate question.
>>
>> What I would like to see is the development of a kind of library of AWK
>> routines to manipulate GEDCOM files. Lots of genealogists have GEDCOM files,
>> and some would like to make changes to them, or extract information from them
>> in ways that might not be possible with other genealogy programs.
>
>You can consider the awk operations to be quite primitive for the
>given syntax of the GEDCOM files, so a library seems not really
>necessary; just write the awk command. I will give examples below.
>
>But first I'd like to ask for confirmation what a GEDCOM "field"
>actually is, per semantic and syntax. Is it _one whole line_ with
>a specific 4-letter tag in column 2, or is it the _rest_ of a line
>where the first two columns are some number and a data type tag?
>
>To change data of a specific line identify the line by a pattern
>on the type field (please note that Lew already gave such example).
>To perform action on a "NAME" field, replacing "Service" by "S."
>
> awk '$2 == "NAME" { sub(/Service/, "S.") } { print $0 }'
>
>likewise negate the condition if you want to select type tags other
>than name.
>
>To exclude tag names from processing that seem to have a specific
>meaning
>
> awk '$2 !~ /@.*@/ { sub(/Service/, "S.") } { print $0 }'

I substituted "Ellwood1.ged" for "!~" and got this:

gawk: (FILENAME=ellwood1.ged FNR=40697) fatal: cannot open file `/@.*@/' for
reading (Invalid argument)

Please forgive my ignorance -- I'm still feeling my way in the dark with this
stuff.

--
Steve Hayes from Tshwane, South Africa
Blog: http://khanya.wordpress.com
E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
On Wed, 20 Feb 2013 02:49:31 +0000 (UTC), gazelle@shell.xmission.com (Kenny
McCormack) wrote:

>In article <kg0kdh$e7a$1@speranza.aioe.org>,
>Tony Proctor <tony@proctor_NoMore_SPAM.net> wrote:
>...
>>A scripting library could be useful, but I would never use one based simply
>>on a text-processing language.
>>
>>I mentioned XML, not because you wanted to use it but as an analogy.
>>Scripting manipulation of XML is usually done with XSLT, which for all its
>>faults and obfuscation deals with entities rather than text.
>
>I agree with (what I think is) your underlying point - which is that AWK is
>not a good tool for parsing a programming language. Yes, it can be done,
>and, yes, it has been done - but it is just not really the right tool.
>People will say that AWK can do "anything", and I suppose it is probably
>true that AWK can do anything that "standard C" can do - that is, as long as
>it is pure text manipulation and no system calls. You can do this, of
>course, by putting the whole program in the BEGIN block and completely
>ignoring the "pattern/action loop" - i.e., the real value and point of AWK.

That may be so, but GEDCOM files are not a programming language, they are data
files.

XML files are also data files, and some have suggested that GEDCOM be replaced
by XML files, and perhaps they may happen one day.

But right now people still use GEDCOM files, and I think a lot could be done
with AWK to look at the data in different ways, or to massage the files before
importing them into another genealogy program.

--
Steve Hayes from Tshwane, South Africa
Blog: http://khanya.wordpress.com
E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
On 2/20/2013 2:25 AM, Steve Hayes wrote:
> On Wed, 20 Feb 2013 08:41:54 +0100, Janis Papanagnou
> <janis_papanagnou@hotmail.com> wrote:
>
>> On 20.02.2013 07:05, Steve Hayes wrote:
>>> On Tue, 19 Feb 2013 14:06:25 +0100, Janis Papanagnou
>>> <janis_papanagnou@hotmail.com> wrote:
>>>
>> [...]
>>>>
>>>> awk '$2 !~ /@.*@/ { sub(/Service/, "S.") } { print $0 }'
>>>
>>> I substituted "Ellwood1.ged" for "!~" and got this:
>>
>> What did you intend to do?
>
> See what happened when I ran it on a file.

Post what you tried because what you say you did `substituted "Ellwood1.ged" for
"!~"` sounds like you replaced the regexp inequality operator with the name of a
file which doesn't make sense. If you wanted to run Janis's script on a file
named Ellwood1.ged you'd do:

awk '$2 !~ /@.*@/ { sub(/Service/, "S.") } { print $0 }' Ellwood1.ged

if you're on UNIX. If you're on Windows, it's different so tell us which OS
you're on.

Ed.

>
> I'll play with it some more.
>
>
In article <kg0kdh$e7a$1@speranza.aioe.org>,
Tony Proctor <tony@proctor_NoMore_SPAM.net> wrote:
...
>A scripting library could be useful, but I would never use one based simply
>on a text-processing language.
>
>I mentioned XML, not because you wanted to use it but as an analogy.
>Scripting manipulation of XML is usually done with XSLT, which for all its
>faults and obfuscation deals with entities rather than text.

I agree with (what I think is) your underlying point - which is that AWK is
not a good tool for parsing a programming language. Yes, it can be done,
and, yes, it has been done - but it is just not really the right tool.
People will say that AWK can do "anything", and I suppose it is probably
true that AWK can do anything that "standard C" can do - that is, as long as
it is pure text manipulation and no system calls. You can do this, of
course, by putting the whole program in the BEGIN block and completely
ignoring the "pattern/action loop" - i.e., the real value and point of AWK.

I once wrote a program to convert something in a programming language text.
I started out doing it in AWK (thinking, "Oh, this will be easy - I just
need to change this into that..."), and worked on it in AWK for quite a
while before realizing it just wasn't the right tool. This was not an easy
decision to make, but eventually, I gave up and re-did it in TXL, which
worked very well and definitely *was* the right tool for the job.

The hardest part was teaching myself TXL - which is no ordinary programming
language (heh heh!). Once one understands what TXL is and how it works,
doing stuff in it is pretty easy.

I also agree with (whoever it was) the poster who wrote that feature-creep
will eventually doom this project (writing a GEDCOM lib in AWK).

--
"They shall be attended by boys graced with eternal youth, who to the
beholder's eyes will seem like sprinkled pearls. When you gaze upon that
scene, you will behold a kingdom blissful and glorious."

--- Qur'an 76:19 ---
On Tue, 19 Feb 2013 22:08:25 -0500, singhals <singhals@erols.com>
wrote in soc.genealogy.computing:

>Kenny McCormack wrote:
>
>> I also agree with (whoever it was) the poster who wrote that feature-creep
>> will eventually doom this project (writing a GEDCOM lib in AWK).
>
>That's one of the things wrong with GED -- people wanted it
>to do things it was /never/ intended to do, but developers
>of GEDStand "tweaked" the standard to allow each s/w
>developer to (a) meet the minimum standard while (b) getting
>95% on moves between two users of the same program. People
>started complaining before the GEDStan developers got out of
>their hotel, because, hey, only 95%?
>
>Cheryl
>ahhh, yes, I am still trying to forget GedStan discussions!

You know what is wrong with Gedcom ? The format itself is something
very powerful and I am surprised it is not in use in other fields.
But the format is adapted to a hierarchical structure that is not
compatible with a straight column-based database. So, what is wrong
is not the format but its use.

Denis

--
Denis Beauregard - généalogiste émérite (FQSG)
Les Français d'Amérique du Nord - www.francogene.com/genealogie--quebec/
French in North America before 1722 - www.francogene.com/quebec--genealogy/
Sur cédérom à 1780 - On CD-ROM to 1780
Kenny McCormack wrote:

> I also agree with (whoever it was) the poster who wrote that feature-creep
> will eventually doom this project (writing a GEDCOM lib in AWK).

That's one of the things wrong with GED -- people wanted it
to do things it was /never/ intended to do, but developers
of GEDStand "tweaked" the standard to allow each s/w
developer to (a) meet the minimum standard while (b) getting
95% on moves between two users of the same program. People
started complaining before the GEDStan developers got out of
their hotel, because, hey, only 95%?

Cheryl
ahhh, yes, I am still trying to forget GedStan discussions!
On Tue, 19 Feb 2013 08:57:27 -0600, Ed Morton <mortonspam@gmail.com> wrote:

>On 2/19/2013 3:14 AM, Steve Hayes wrote:
>> In an earlier message I suggested using AWK to manipulate a GEDCOM file to
>> solve a particular problem.
>>
>> That point tended to get lost in discussion of other points like using other
>> ways to solve the problem, or discussion of flaws in the GEDCOM data model
>> itself and proposals for its replacement, which I see as a separate question.
>>
>> What I would like to see is the development of a kind of library of AWK
>> routines to manipulate GEDCOM files. Lots of genealogists have GEDCOM files,
>> and some would like to make changes to them, or extract information from them
>> in ways that might not be possible with other genealogy programs.
>>
>> Here is a GEDCOM file.
>>
>> I tried to choose a short one to use as an example, which shows the structure
>> of the file.
>
>OK, so that's presumably a good, representative input file for an awk script to
>run against. Now - what might an output file look like and (briefly!) why?

Thanks Ed -- see my reply to Janis for a couple of examples.

--
Steve Hayes from Tshwane, South Africa
Blog: http://khanya.wordpress.com
E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
On Tue, 19 Feb 2013 14:06:25 +0100, Janis Papanagnou
<janis_papanagnou@hotmail.com> wrote:

>To change data of a specific line identify the line by a pattern
>on the type field (please note that Lew already gave such example).
>To perform action on a "NAME" field, replacing "Service" by "S."
>
> awk '$2 == "NAME" { sub(/Service/, "S.") } { print $0 }'
>
>likewise negate the condition if you want to select type tags other
>than name.

Thanks very much for these examples. When I get a chance I will play with them
and see how they work.

A Gedcom file has several parts, but the first part consists of information
about individual people.

The digit at the beginning is a level number, so 0 means data on a new
individual like this:

>> 0 @I98BW-JC@ INDI

1 is the next level, with particular information about the individual, such
as the NAME

>> 1 NAME Emily Jane /THORNTON/
>> 2 GIVN Emily Jane
>> 2 SURN THORNTON

Where the next level is further information about the name

then information about the person's BIRTH

>> 1 BIRT
>> 2 DATE 1854
>> 2 PLAC Geelong, Victoria, Australia

The kind of manipulation one might want to do would be to change place names
to make them consistent throughout the file, so one might want to use
abbreviations and change "Geelong, Victoria, Australia" to "Geelong, VIC,
AUS".

Another might be to produce a report of people who were born in Geelong, but
died elsewhere in Australia.

And so on.

--
Steve Hayes from Tshwane, South Africa
Blog: http://khanya.wordpress.com
E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
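[Editor's sketch] The report Steve describes ("born in Geelong, but died elsewhere in Australia") can be written as a handful of awk rules that follow the level numbers he explains. This is a minimal sketch under assumptions not stated in the thread: death is recorded as a level-1 DEAT tag with a level-2 PLAC line, place names start with the town and end with the country, and the function names value and report are invented here. All three sample individuals are made up.

```sh
tmp=$(mktemp -d)

cat > "$tmp/geelong.awk" <<'EOF'
# Text after the level number and tag on the current line.
function value(    v) { v = $0; sub(/^[0-9]+ +[A-Z]+ */, "", v); return v }

# Print the finished record if it was born in Geelong and died elsewhere in Australia.
function report() {
    if (birth ~ /^Geelong/ && death ~ /Australia$/ && death !~ /^Geelong/)
        print name ": born " birth "; died " death
}

# Level 0 starts a new record: report the previous one, then reset.
$1 == 0 { report(); name = birth = death = event = ""; next }

# Level 1: remember which tag the following level-2 lines belong to.
$1 == 1 { event = $2 }
$1 == 1 && $2 == "NAME" { name = value() }

# Level 2: a PLAC line belongs to whichever event came last.
$1 == 2 && $2 == "PLAC" && event == "BIRT" { birth = value() }
$1 == 2 && $2 == "PLAC" && event == "DEAT" { death = value() }

# The last record has no following level-0 line, so report it here.
END { report() }
EOF

# Made-up individuals: only the first should appear in the report.
cat > "$tmp/sample.ged" <<'EOF'
0 @I1@ INDI
1 NAME Alice /EXAMPLE/
1 BIRT
2 DATE 1854
2 PLAC Geelong, Victoria, Australia
1 DEAT
2 PLAC Melbourne, Victoria, Australia
0 @I2@ INDI
1 NAME Bob /EXAMPLE/
1 BIRT
2 PLAC Geelong, Victoria, Australia
1 DEAT
2 PLAC Geelong, Victoria, Australia
0 @I3@ INDI
1 NAME Carol /EXAMPLE/
1 BIRT
2 PLAC Geelong, Victoria, Australia
1 DEAT
2 PLAC London, England
EOF

out=$(awk -f "$tmp/geelong.awk" "$tmp/sample.ged")
printf '%s\n' "$out"

rm -rf "$tmp"
```

The sketch keeps state only for the record being read, so the size of the GEDCOM file does not matter.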
On 19.02.2013 20:39, Tony Proctor wrote:
> "Steve Hayes" <hayesstw@telkomsa.net> wrote in message
> news:h4p6i8l277lgaqsbl8ad70jkek4injed62@4ax.com...
>> On Tue, 19 Feb 2013 11:02:25 +0000, Ian Goddard <goddai01@hotmail.co.uk>
>> wrote:
>>
>> I think you are being too dismissive.
>>
>> I'm not talking about XML files, but about Gedcom files, and I'm not
>> talking about a DOM, but about AWK.
>>
>> And I'm not talking about some hypothetical Platonic ideal of the perfect
>> Gedcom replacement, but about the actual Gedcom files that millions of
>> genealogists have on their computers now.
>>
>> These "you can't get there from here" comments are really not very
>> helpful.
>>
>>
>
> I gave advice based on experience Steve. For every AWK file you write, I can
> contrive a GEDCOM example that will break it.

If GEDCOM has a formal definition you will be able to write an awk
program to solve your task. According to a newer posting it seems that
GEDCOM is a very primitive definition; I see no reason why you cannot
process it using a language like awk.

> Another post in this thread
> mentioned ambiguities, and having to assume the availability of special
> characters that won't occur in names or notes.

Though, there's a way to handle those issues. It's not impossible, you
just have to be aware of it; otherwise you'd get surprises. But that
just boils down to "You need to know what to do.", which is valid for
every approach.

>
> A scripting library could be useful, but I would never use one based simply
> on a text-processing language.

Why?

Janis

>
> I mentioned XML, not because you wanted to use it but as an analogy.
> Scripting manipulation of XML is usually done with XSLT, which for all its
> faults and obfuscation deals with entities rather than text.
>
> Tony Proctor
>
>
>
On 19.02.2013 20:15, Steve Hayes wrote:
> On Tue, 19 Feb 2013 14:06:25 +0100, Janis Papanagnou
> <janis_papanagnou@hotmail.com> wrote:
>
>> To change data of a specific line identify the line by a pattern
>> on the type field (please note that Lew already gave such example).
>> To perform action on a "NAME" field, replacing "Service" by "S."
>>
>> awk '$2 == "NAME" { sub(/Service/, "S.") } { print $0 }'
>>
>> likewise negate the condition if you want to select type tags other
>> than name.
>
> Thanks very much for these examples. When I get a chance I will play with them
> and see how they work.
>
> A Gedcom file has several parts, but the first part consists of information
> about individual people.
>
> The digit at the beginning is a level number, so 0 means data on a new
> individual like this:
>
>>> 0 @I98BW-JC@ INDI
>
> 1 is the next level, with particular information about the individual, such
> as the NAME
>
>>> 1 NAME Emily Jane /THORNTON/
>>> 2 GIVN Emily Jane
>>> 2 SURN THORNTON
>
> Where the next level is further information about the name
>
> then information about the person's BIRTH
>
>>> 1 BIRT
>>> 2 DATE 1854
>>> 2 PLAC Geelong, Victoria, Australia
>
> The kind of manipulation one might want to do would be to change place names
> to make them consistent throughout the file, so one might want to use
> abbreviations and change "Geelong, Victoria, Australia" to "Geelong, VIC,
> AUS".
>
> Another might be to produce a report of people who were born in Geelong, but
> died elsewhere in Australia.
>
> And so on.

I see. I'm sure awk fits very well for such manipulations.

Given your example above one further step would be using a file
with mapping information, thus letting awk do all that mapping
without changing the awk program.

Janis
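[Editor's sketch] Janis's mapping-file idea, sketched under assumptions that are not from the thread: the map is a plain text file with one "old place=new place" pair per line, a place is rewritten only when the whole PLAC value matches an entry, and the file names places.map and placemap.awk are invented. The GEDCOM lines are the sample quoted above.

```sh
tmp=$(mktemp -d)

# Mapping file: old place name, "=", new place name. Add lines here, not code.
cat > "$tmp/places.map" <<'EOF'
Geelong, Victoria, Australia=Geelong, VIC, AUS
EOF

# Sample records quoted in the thread.
cat > "$tmp/sample.ged" <<'EOF'
0 @I98BW-JC@ INDI
1 NAME Emily Jane /THORNTON/
2 GIVN Emily Jane
2 SURN THORNTON
1 BIRT
2 DATE 1854
2 PLAC Geelong, Victoria, Australia
EOF

cat > "$tmp/placemap.awk" <<'EOF'
# First file on the command line (the map): remember old -> new, then move on.
NR == FNR { if (split($0, kv, "=") == 2) map[kv[1]] = kv[2]; next }

# GEDCOM file: rewrite a PLAC line only when the whole place is in the map.
$2 == "PLAC" {
    place = $0
    sub(/^[0-9]+ +PLAC +/, "", place)
    if (place in map) $0 = $1 " PLAC " map[place]
}
{ print }
EOF

# The map must come before the GEDCOM file on the command line.
out=$(awk -f "$tmp/placemap.awk" "$tmp/places.map" "$tmp/sample.ged")
printf '%s\n' "$out"

rm -rf "$tmp"
```

Matching the whole place value, rather than substituting inside it, is a deliberate choice here: it avoids rewriting "Victoria" in a place that merely contains the word.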