On 16 Feb 2013 in soc.genealogy.computing, Ian Goddard wrote: > Actually it was Cygwin http://www.cygwin.com/ I was thinking of. > It's a long time since I used it (much easier these days to just run > Linux) but IIRC it provides its own Unix-style shell & doesn't then > depend on the conflict between the way command.exe & shell handle / > and \ etc. I'm not sure if the others do. Probably not, although I tend to avoid pathing issues by changing to the directory I want to use before I use the utilities. Having grep available in a 'DOS' shell is extremely convenient. -- Joe Makowiec http://makowiec.org/ Email: http://makowiec.org/contact/?Joe Usenet Improvement Project: http://twovoyagers.com/improve-usenet.org/
On Sat, 16 Feb 2013 17:03:31 -0500, Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote: >On Sat, 16 Feb 2013 15:52:50 -0500, Dennis Lee Bieber ><wlfraed@ix.netcom.com> declaimed the following in >soc.genealogy.computing: > > >> I suspect the find box should have enough of the GEDCOM to identify >> a NAME record, expressions to find the name parts >> >> (<*>) "(<*>)" (<*>) >> >> (if it is known that only a first "nick" last format is in use -- the < >> and > are beginning of word, end of word). Replace box would be >> >> \1 \2 \3 > > Problem: Word's regular expressions are greedy -- they match the >longest stretch found, and lack a way to specify "end-of-line" to limit >the match. > >Find: 1 NAME (*) "(*)" (*) >Replace: 1 NAME \1 \2 \3 > > Almost works, but without the end-of-line limit Word will start from >the first "1 NAME" record and match to the first ", then match to the >next ", and then match the rest of the document. > > Powershell command (all on one line, starting at get-content, put in >/your/ file paths) {as mentioned, Powershell 2.0 is a download for >WinXP, comes with Win7 -- Powershell 3.0 is a download for Win7} > >Microsoft Windows XP [Version 5.1.2600] >(C) Copyright 1985-2001 Microsoft Corp. > >E:\UserData\Wulfraed\My Documents>powershell >Windows PowerShell >Copyright (C) 2009 Microsoft Corporation. All rights reserved. > >PS E:\UserData\Wulfraed\My Documents> get-content >'E:\UserData\Wulfraed\My Documents\bieber.ged' | foreach {$_ -replace >'1 NAME (.*) "(.*)" (.*)', '1 NAME $1 $2 $3'} > >'E:\UserData\Wulfraed\My Documents\new.ged' > >PS E:\UserData\Wulfraed\My Documents> > > > get-content is basically the same as "type", | feed the lines output >by get-content to the foreach loop, $_ current line in loop, find "1 >NAME anything "more" rest" replace with "1 NAME firstgroup secondgroup >lastgroup" >write output to file... > > As this works on a line-by-line basis, it does not get greedy. 
BUT >if there are more than one pair of " on a NAME line, only one will be >matched. It did not affect "s found in other fields, in my test. > > If you have names in other fields, OR your GEDCOM puts names over >continuation lines, the regular expression will need to be changed >(continuation lines will be tricky -- may not be possible as a simple >one-liner). > >-- > Wulfraed Dennis Lee Bieber AF6VN > wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/ I think I'll have to read that a few times to get it - but thanks. Hugh
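For anyone reading along without PowerShell, the same line-by-line idea can be sketched in Python. This is only an illustration of the approach quoted above, not Dennis's command: the function names and the sample records are made up, and it assumes every affected record looks like `1 NAME first "nick" rest` on a single line.

```python
import re

# Same pattern as the quoted PowerShell -replace, anchored to a whole line.
# Assumption: the quoted name part sits on a level-1 NAME record.
NAME_WITH_QUOTES = re.compile(r'^1 NAME (.*) "(.*)" (.*)$')


def strip_name_quotes(line):
    """Remove the quote marks around the quoted name part of one NAME line."""
    return NAME_WITH_QUOTES.sub(r'1 NAME \1 \2 \3', line)


def convert(lines):
    """Apply the replacement one line at a time, so a match cannot run past a line."""
    return [strip_name_quotes(line) for line in lines]


# Hypothetical sample records, for illustration only.
sample = [
    '0 @I1@ INDI',
    '1 NAME James "Hugh" /Sullivan/',
    '1 NOTE Known as "Hugh" to the family',
]
result = convert(sample)
```

Because each line is handled on its own, the quotes in the NOTE record are left alone. As in the quoted post, a NAME line with more than one quoted pair, or a name spread over continuation lines, would need a different pattern.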
On Sat, 16 Feb 2013 09:04:31 -0600, Charlie Hoffpauir <invalid@invalid.com> wrote: >>I think if one knew enough about AWK (which I don't) one could develop quite a >>lot of useful routines for the manipulation of GEDCOM files. > >Steve, > >You sparked my interest. I still try to use GEDCOM Explorer (GEDX) >occasionally, so I'd like to hear more about AWK. I might have heard >of it, but it's too early in the morning for me to think clearly.... >What does AWK stand for?

The AWK Programming Language
Users Manual and Tutorial

This document is an introduction to the use of AWK for manipulating text and the textual representation of numbers. This mouthful means that you can use AWK to manipulate words and numbers.

1. Basic Concepts

1.1 AWK Programs

AWK programs consist of a series of PATTERNS and ACTIONS. Patterns are boolean (logical) expressions that are evaluated and if they are true (non-zero number or non-null string) then the associated Action is performed. Actions are program fragments in a "C" like language.

The Pattern-Action statements comprising an AWK program are evaluated in turn for each input RECORD. That is, a Record is read and the Patterns in the program are evaluated in order, for each Pattern that succeeds, an Action is performed. For example:

    NR == 5 { print }

is a simple program that prints the fifth line of a file. NR is a built-in variable that is equal to the number of records AWK has read so far. The double equal sign is the equality comparison operator from C.

As you can see from the above example, a Pattern is a naked expression and an Action is a compound statement or list of program statements enclosed in braces ({}).

You may omit the Action in a Pattern/Action statement in which case the default action is { print }. You may, on the other hand omit the Pattern which defaults to true, so that the Action is always taken. Finally if you omit both the Pattern and the Action you have a blank line, which is ignored.

1.2 Fields and Records

To AWK all data are divided into FIELDS and RECORDS. The definition of a field is any string of characters separated by the Field Separator or FS for short. Similarly a record is any string of characters separated by the Record Separator or RS.

In the simplest form a Field is a string of characters surrounded by white space (blanks or tabs), and a Record is a line of text. You can make the Field Separator as complex as you like by providing your own REGULAR EXPRESSION for the FS. The Record Separator is limited to the null string "" or a newline "\n". The null string means that a blank line separates a multi-line record, and the newline means that each line is a record.

You can refer to the Fields in the current Record with the dollar ($) operator:

    $3 < 10 { print NR, $0 }
____________________________________

There's more than that, of course, but that should give you the general idea. AWK comes standard with Unix, and a variant, called GAWK with Linux. I use a DOS implementation. -- Steve Hayes from Tshwane, South Africa Blog: http://khanya.wordpress.com E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
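The pattern/action model in the manual excerpt can be mimicked in a few lines of Python, which may help readers who have never used AWK. This is a rough sketch, not a description of how any AWK implementation works: `run_awk_like` and the sample records are invented for illustration, fields are split on white space only, and records without a numeric third field are simply skipped, which is a simplification of AWK's own comparison rules.

```python
def run_awk_like(lines, rules):
    """Run (pattern, action) pairs against every record, AWK style.

    Each pattern and action is called as f(nr, fields, record), where nr
    plays the role of NR and record the role of $0.
    """
    output = []
    for nr, record in enumerate(lines, start=1):
        fields = record.split()  # default field separator: runs of white space
        for pattern, action in rules:
            if pattern(nr, fields, record):
                output.append(action(nr, fields, record))
    return output


def third_field_below_ten(nr, fields, record):
    """Rough stand-in for the AWK pattern `$3 < 10`."""
    try:
        return float(fields[2]) < 10
    except (IndexError, ValueError):
        return False  # simplification: no numeric third field means no match


# NR == 5 { print }
print_fifth = [(lambda nr, fields, record: nr == 5,
                lambda nr, fields, record: record)]

# $3 < 10 { print NR, $0 }
print_small = [(third_field_below_ten,
                lambda nr, fields, record: f"{nr} {record}")]

# Hypothetical input records.
records = ['a b 3', 'c d 12', 'e f 7', 'g h', 'i j 20']
fifth = run_awk_like(records, print_fifth)
small = run_awk_like(records, print_small)
```

Here `fifth` holds only the fifth record, and `small` holds the first and third records, each prefixed with its record number.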
On Sat, 16 Feb 2013 11:19:26 -0600, Charlie Hoffpauir <invalid@invalid.com> wrote: >I think grep would work on making the changes to the GEDCOMs, but as >far as I know it's only used in Linux or Unix-like programs, which >I've played with but decided at my age it's too late to try to become >proficient in. Using Word is the lazy way, you just figure out what >Word editing steps you have to make, then record it to a macro. If it >turns out that isolating the given name field in the GEDCOM is easy, >then it might not even be worthwhile to record it to a macro.... just >a couple of steps of search/replace. It would be super easy if you had >used some other character to define the used name.... it gets >difficult because the " mark is probably used in many places of text >where you don't want the " replaced by another character. I presumed from what I was told that it would be as simple as telling grep to look for the GED Sullivan Combo, find "X" or "*" and replace the quotes. It forms a new file but doesn't destroy the old. Of course everything is simple to that guy. I'm not sure why I would think anything is that simple. I don't recall using ""s for anything else. Hugh
On Sat, 16 Feb 2013 17:03:31 -0500, Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote: >On Sat, 16 Feb 2013 15:52:50 -0500, Dennis Lee Bieber ><wlfraed@ix.netcom.com> declaimed the following in >soc.genealogy.computing: > > >> I suspect the find box should have enough of the GEDCOM to identify >> a NAME record, expressions to find the name parts >> >> (<*>) "(<*>)" (<*>) >> >> (if it is known that only a first "nick" last format is in use -- the < >> and > are beginning of word, end of word). Replace box would be >> >> \1 \2 \3 > > Problem: Word's regular expressions are greedy -- they match the >longest stretch found, and lack a way to specify "end-of-line" to limit >the match. > >Find: 1 NAME (*) "(*)" (*) >Replace: 1 NAME \1 \2 \3 > > Almost works, but without the end-of-line limit Word will start from >the first "1 NAME" record and match to the first ", then match to the >next ", and then match the rest of the document. > > Powershell command (all on one line, starting at get-content, put in >/your/ file paths) {as mentioned, Powershell 2.0 is a download for >WinXP, comes with Win7 -- Powershell 3.0 is a download for Win7} > >Microsoft Windows XP [Version 5.1.2600] >(C) Copyright 1985-2001 Microsoft Corp. > >E:\UserData\Wulfraed\My Documents>powershell >Windows PowerShell >Copyright (C) 2009 Microsoft Corporation. All rights reserved. > >PS E:\UserData\Wulfraed\My Documents> get-content >'E:\UserData\Wulfraed\My Documents\bieber.ged' | foreach {$_ -replace >'1 NAME (.*) "(.*)" (.*)', '1 NAME $1 $2 $3'} > >'E:\UserData\Wulfraed\My Documents\new.ged' > >PS E:\UserData\Wulfraed\My Documents> > > > get-content is basically the same as "type", | feed the lines output >by get-content to the foreach loop, $_ current line in loop, find "1 >NAME anything "more" rest" replace with "1 NAME firstgroup secondgroup >lastgroup" >write output to file... > > As this works on a line-by-line basis, it does not get greedy. 
BUT >if there are more than one pair of " on a NAME line, only one will be >matched. It did not affect "s found in other fields, in my test. > > If you have names in other fields, OR your GEDCOM puts names over >continuation lines, the regular expression will need to be changed >(continuation lines will be tricky -- may not be possible as a simple >one-liner). Dennis, This looks easier than what I was doing. To solve the problems you see with Word I was going into Visual Basic for the macros.... I'll have to look into Powershell.... I'd never heard of it before.
Joe Makowiec wrote: > On 16 Feb 2013 in soc.genealogy.computing, Ian Goddard wrote: > >> grep looks for lines containing particular strings. But it doesn't >> edit things. awk and sed do. The thing is that these are all Unix >> commands. >> There is a package called Cygnus which makes them available on Windows > > There are also native Windows ports: > > http://gnuwin32.sourceforge.net/ > > http://sourceforge.net/projects/unxutils/ > > and probably more if you do a web search for [gnu utilities windows]. > Actually it was Cygwin http://www.cygwin.com/ I was thinking of. It's a long time since I used it (much easier these days to just run Linux) but IIRC it provides its own Unix-style shell & doesn't then depend on the conflict between the way command.exe & shell handle / and \ etc. I'm not sure if the others do. -- Ian The Hotmail address is my spam-bin. Real mail address is iang at austonley org uk
On 16 Feb 2013 in soc.genealogy.computing, Ian Goddard wrote: > grep looks for lines containing particular strings. But it doesn't > edit things. awk and sed do. The thing is that these are all Unix > commands. > There is a package called Cygnus which makes them available on Windows There are also native Windows ports: http://gnuwin32.sourceforge.net/ http://sourceforge.net/projects/unxutils/ and probably more if you do a web search for [gnu utilities windows]. -- Joe Makowiec http://makowiec.org/ Email: http://makowiec.org/contact/?Joe Usenet Improvement Project: http://twovoyagers.com/improve-usenet.org/
J. Hugh Sullivan wrote: > On Sat, 16 Feb 2013 09:04:31 -0600, Charlie Hoffpauir > <invalid@invalid.com> wrote: > >> You sparked my interest. I still try to use GEDCOM Explorer (GEDX) >> occasionally, so I'd like to hear more about AWK. I might have heard >> of it, but it's too early in the morning for me to think clearly.... >> What does AWK stand for? Aho, Weinberger & Kernighan - its authors. > After talking with my friend he suggests using GREP. Are you familiar > with GREP - Steve? grep looks for lines containing particular strings. But it doesn't edit things. awk and sed do. The thing is that these are all Unix commands. There is a package called Cygnus which makes them available on Windows but that's really intended to make Unixers feel at home on Windows. They're not easy to use for the uninitiated - especially awk. Here, for example, are the instructions for sed: http://unixhelp.ed.ac.uk/CGI/man-cgi?sed -- Ian The Hotmail address is my spam-bin. Real mail address is iang at austonley org uk
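The difference Ian describes, grep selecting lines and sed editing them, can be shown with two small Python functions. This is only an analogy: `grep_like` and `sed_like` are made-up names, the sample lines are hypothetical, and neither function reproduces the options of the real tools.

```python
import re


def grep_like(pattern, lines):
    """Return the lines that contain a match; the lines themselves are unchanged."""
    return [line for line in lines if re.search(pattern, line)]


def sed_like(pattern, replacement, lines):
    """Return every line, with each match replaced (roughly sed's s/old/new/g)."""
    return [re.sub(pattern, replacement, line) for line in lines]


# Hypothetical GEDCOM lines.
gedcom_lines = [
    '1 NAME James "Hugh" /Sullivan/',
    '1 SEX M',
]
found = grep_like('"', gedcom_lines)      # selects the NAME line only
edited = sed_like('"', '', gedcom_lines)  # rewrites it, keeps the rest
```

So a grep-style search can tell you which lines carry quote marks, but something sed-like (or awk-like) is needed to write out a changed file.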
On Sat, 16 Feb 2013 09:04:31 -0600, Charlie Hoffpauir <invalid@invalid.com> wrote: >You sparked my interest. I still try to use GEDCOM Explorer (GEDX) >occasionally, so I'd like to hear more about AWK. I might have heard >of it, but it's too early in the morning for me to think clearly.... >What does AWK stand for? After talking with my friend he suggests using GREP. Are you familiar with GREP - Steve? Hugh
On Fri, 15 Feb 2013 18:48:58 -0600, Charlie Hoffpauir <invalid@invalid.com> wrote: >If Legacy won't let you do the replacement in the given name field, >then it gets more complicated, but still doable. In essence, you >create the GEDCOM from Legacy with the names unchanged, then do the >search/replace on the GEDCOM. The reason it's more difficult there is >because then you have to write some sort of macro to do the >replacement only on quote marks found in the name field. I did a few >macros similar to that using Word to process the GEDCOMs generated by >FTM back before RM wouldn't do the import directly. I never needed macros to accomplish my purpose so I never got involved. A friend wrote macros 20 years ago to start checking stocks and voice report while he sat for coffee and breakfast. I got as far as the coffee and breakfast part. What might be even more interesting is to remove the quotes to export to RM and replace them to export back to Legacy. Bottom line I think the navigation of the family screen points me to Legacy. I have a Mensa friend who can do almost anything with programming. I might turn the problem over to him. He could probably solve the name source problem, too. But going through the routines every time I wanted to switch between programs would get cumbersome. My conclusion is that there is no universal best. Meanwhile I will follow the exchange between you and Steve - I'm bound to learn something. Hugh
On Sat, 16 Feb 2013 16:51:50 GMT, Eagle@bellsouth.net (J. Hugh Sullivan) wrote: >On Sat, 16 Feb 2013 09:04:31 -0600, Charlie Hoffpauir ><invalid@invalid.com> wrote: > >>You sparked my interest. I still try to use GEDCOM Explorer (GEDX) >>occasionally, so I'd like to hear more about AWK. I might have heard >>of it, but it's too early in the morning for me to think clearly.... >>What does AWK stand for? > >After talking with my friend he suggests using GREP. Are you familiar >with GREP - Steve? > >Hugh I think grep would work on making the changes to the GEDCOMs, but as far as I know it's only used in Linux or Unix-like programs, which I've played with but decided at my age it's too late to try to become proficient in. Using Word is the lazy way, you just figure out what Word editing steps you have to make, then record it to a macro. If it turns out that isolating the given name field in the GEDCOM is easy, then it might not even be worthwhile to record it to a macro.... just a couple of steps of search/replace. It would be super easy if you had used some other character to define the used name.... it gets difficult because the " mark is probably used in many places of text where you don't want the " replaced by another character.
On Sat, 16 Feb 2013 08:19:55 +0200, Steve Hayes <hayesstw@telkomsa.net> wrote: >On Fri, 15 Feb 2013 18:48:58 -0600, Charlie Hoffpauir <invalid@invalid.com> >wrote: <snip> >> >>If Legacy won't let you do the replacement in the given name field, >>then it gets more complicated, but still doable. In essence, you >>create the GEDCOM from Legacy with the names unchanged, then do the >>search/replace on the GEDCOM. The reason it's more difficult there is >>because then you have to write some sort of macro to do the >>replacement only on quote marks found in the name field. I did a few >>macros similar to that using Word to process the GEDCOMs generated by >>FTM back before RM wouldn't do the import directly. > >I think that is the kind of thing that AWK is designed to do fairly easily and >efficiently. > >I think if one knew enough about AWK (which I don't) one could develop quite a >lot of useful routines for the manipulation of GEDCOM files. Steve, You sparked my interest. I still try to use GEDCOM Explorer (GEDX) occasionally, so I'd like to hear more about AWK. I might have heard of it, but it's too early in the morning for me to think clearly.... What does AWK stand for?
On Fri, 15 Feb 2013 23:03:14 GMT, Eagle@bellsouth.net (J. Hugh Sullivan) wrote: >On Fri, 15 Feb 2013 11:19:10 -0600, Charlie Hoffpauir ><invalid@invalid.com> wrote: > >> >>Hugh, >> >>A question.... will Legacy allow you to do a search/replace like the >>Control-H feature in RM? If so, do a search/replace on the given name >>field, searching for your " marks and replace with the | character (a >>vertical line found in the uppercase field above the \). > >Legacy allows search and replace but only for the categories they >allow - every fact field, I think. > >I tried a "Hugh" (Given Name category) some time ago and it found >nothing to replace. I didn't see the ability to search and replace >quote marks. I'll look a little more. I think, if I recall correctly, that that is one of the things that PAF CAN do. -- Steve Hayes from Tshwane, South Africa Blog: http://khanya.wordpress.com E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
On Fri, 15 Feb 2013 18:48:58 -0600, Charlie Hoffpauir <invalid@invalid.com> wrote: >On Fri, 15 Feb 2013 23:03:14 GMT, Eagle@bellsouth.net (J. Hugh >Sullivan) wrote: >>Legacy allows search and replace but only for the categories they >>allow - every fact field, I think. >> >>I tried a "Hugh" (Given Name category) some time ago and it found >>nothing to replace. I didn't see the ability to search and replace >>quote marks. I'll look a little more. >> >>Thankee, >> >>Hugh > >If Legacy won't let you do the replacement in the given name field, >then it gets more complicated, but still doable. In essence, you >create the GEDCOM from Legacy with the names unchanged, then do the >search/replace on the GEDCOM. The reason it's more difficult there is >because then you have to write some sort of macro to do the >replacement only on quote marks found in the name field. I did a few >macros similar to that using Word to process the GEDCOMs generated by >FTM back before RM wouldn't do the import directly. I think that is the kind of thing that AWK is designed to do fairly easily and efficiently. I think if one knew enough about AWK (which I don't) one could develop quite a lot of useful routines for the manipulation of GEDCOM files. -- Steve Hayes from Tshwane, South Africa Blog: http://khanya.wordpress.com E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
On Fri, 15 Feb 2013 11:19:10 -0600, Charlie Hoffpauir <invalid@invalid.com> wrote: > >Hugh, > >A question.... will Legacy allow you to do a search/replace like the >Control-H feature in RM? If so, do a search/replace on the given name >field, searching for your " marks and replace with the | character (a >vertical line found in the uppercase field above the \). Legacy allows search and replace but only for the categories they allow - every fact field, I think. I tried a "Hugh" (Given Name category) some time ago and it found nothing to replace. I didn't see the ability to search and replace quote marks. I'll look a little more. Thankee, Hugh
On Fri, 15 Feb 2013 23:03:14 GMT, Eagle@bellsouth.net (J. Hugh Sullivan) wrote: >On Fri, 15 Feb 2013 11:19:10 -0600, Charlie Hoffpauir ><invalid@invalid.com> wrote: > >> >>Hugh, >> >>A question.... will Legacy allow you to do a search/replace like the >>Control-H feature in RM? If so, do a search/replace on the given name >>field, searching for your " marks and replace with the | character (a >>vertical line found in the uppercase field above the \). > >Legacy allows search and replace but only for the categories they >allow - every fact field, I think. > >I tried a "Hugh" (Given Name category) some time ago and it found >nothing to replace. I didn't see the ability to search and replace >quote marks. I'll look a little more. > >Thankee, > >Hugh If Legacy won't let you do the replacement in the given name field, then it gets more complicated, but still doable. In essence, you create the GEDCOM from Legacy with the names unchanged, then do the search/replace on the GEDCOM. The reason it's more difficult there is because then you have to write some sort of macro to do the replacement only on quote marks found in the name field. I did a few macros similar to that using Word to process the GEDCOMs generated by FTM back before RM wouldn't do the import directly.
Steve Hayes wrote: > On Fri, 15 Feb 2013 10:43:24 +0000, Ian Goddard <goddai01@hotmail.co.uk> > wrote: > >> Steve Hayes wrote: >>> >>> If only they had continued to develop OS/2, which could run Windows in a >>> window as well. >>> >> >> Virtualbox lets you run Windows in a window ;) It also lets you run >> individual Windows apps in their own individual windows. > > And copy and paste between them? Yup. Host Debian Linux 64bit. Guest Win 7 32bit. Start guest full-screen (ie it occupies the whole of the screen not just appearing inside a maximised host window), fire up Word & Publisher, de-maximise so they appear side by side. Enter a Header 1 & some body text into Word. Switch guest to seamless mode (ie the Windows apps appear in separate host windows). Copy text from Word. Go to Publisher & select Copy special. Publisher asks if I want to paste as a Word object, select Yes & it does. Note I'm not at all familiar with Publisher so I'm not sure what other options there'd be. Also copies & pastes from Notepad to Notepad. Notepad is rather easier on the eye as Word & Publisher have so much clutter around them that they get scaled down to near unreadability when running in windows small enough to show both on a laptop. Can also copy & paste back & forth between host & guest OK for text but, as the applications on the two platforms have different semantics, WP formatting is lost. I haven't the memory on this laptop to try running multiple guests so I don't know if I could copy & paste between different guests. -- Ian The Hotmail address is my spam-bin. Real mail address is iang at austonley org uk
On Fri, 15 Feb 2013 10:43:24 +0000, Ian Goddard <goddai01@hotmail.co.uk> wrote: >Steve Hayes wrote: >> >> If only they had continued to develop OS/2, which could run Windows in a >> window as well. >> > >Virtualbox lets you run Windows in a window ;) It also lets you run >individual Windows apps in their own individual windows. And copy and paste between them? -- Steve Hayes from Tshwane, South Africa Blog: http://khanya.wordpress.com E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
On Thu, 14 Feb 2013 17:53:03 GMT, Eagle@bellsouth.net (J. Hugh Sullivan) wrote: >On Thu, 14 Feb 2013 10:15:42 -0600, Charlie Hoffpauir ><invalid@invalid.com> wrote: > >>Hugh, I guess I don't understand the part about RM picking it up >>twice. Do you mean if it's in quotes in Legacy, then GEDCOMed into RM, >>it picks the quoted name twice? If that's the case, then I think what is >>happening is that RM Thinks the quotes mean it's a nickname, which RM >>puts in quotes and includes when presenting the name. > >Thanks, Charlie, you nailed it. > >>If you don't use nicknames, then it might be a fairly simple task to just remove all >>the nicknames after importing the GEDCOM into RM. After saying that, >>I'll admit I don't know how to do that (other than doing it on the >>GEDCOM), but there's a guy that posts often to the RM mail list that >>has written lots of SQL scripts to operate on the RM database, and I'm >>sure he could give some suggestions. > >I do use a few nicknames. I have posted both problems to the RM user >group. Some comments, but no cures were posted to the first problem. I >didn't even get a response to the double name post. > >I "use" programs - I don't know how to "maneuver" the guts. That's the >problem with growing up before computers. > >Bruce and I used to communicate a lot back in the old days when we >were insulting the Banner Blue guy and he incorporated several >features I requested. But I don't think he knows me any more. He >outgrew me quickly. > >I like one thing in Legacy more than RM - the Family Screen. It >presents more info and I think the navigation is much easier. It's an >"old dog, new tricks" thing. > >Until a few years ago I made an effort to try every available >genealogy program. After trying so many RM was my "first love" - I >hate to give it up. > >I just tried PAF and it navigates the Family screen like RM - Bye, >Bye. I'd like to say it is a good starter program but Cheryl would >kick me where I bend in the middle. 
> >Hugh Hugh, A question.... will Legacy allow you to do a search/replace like the Control-H feature in RM? If so, do a search/replace on the given name field, searching for your " marks and replace with the | character (a vertical line found in the uppercase field above the \). Then export the GEDCOM and import it into RM. I don't think RM will mess with that on the import. Then repeat the search/replace process in RM this time replacing instances of | with ".
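The placeholder round trip suggested here is meant to be done inside Legacy and RM, but the same idea can be sketched on GEDCOM text in Python for anyone who ends up editing the file instead. The function name and sample records below are hypothetical, and the sketch assumes the | character does not otherwise appear on any NAME line.

```python
def swap_on_name_lines(lines, old, new):
    """Replace `old` with `new`, but only on level-1 NAME lines."""
    return [
        line.replace(old, new) if line.startswith('1 NAME ') else line
        for line in lines
    ]


# Hypothetical records as they might come out of the first program.
legacy_export = [
    '1 NAME James "Hugh" /Sullivan/',
    '1 NOTE He was called "Hugh" at home',
]

# Step 1: hide the quote marks behind a placeholder before the import.
for_rm = swap_on_name_lines(legacy_export, '"', '|')

# Step 2: after the import, turn the placeholder back into quote marks.
restored = swap_on_name_lines(for_rm, '|', '"')
```

Restricting the swap to NAME lines keeps quote marks in notes and other text fields untouched, which is the difficulty raised elsewhere in the thread.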
Steve Hayes wrote: > > If only they had continued to develop OS/2, which could run Windows in a > window as well. > Virtualbox lets you run Windows in a window ;) It also lets you run individual Windows apps in their own individual windows. -- Ian The Hotmail address is my spam-bin. Real mail address is iang at austonley org uk