On Sat, 16 Feb 2013 17:03:31 -0500, Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:

>On Sat, 16 Feb 2013 15:52:50 -0500, Dennis Lee Bieber
><wlfraed@ix.netcom.com> declaimed the following in
>soc.genealogy.computing:
>
>> I suspect the find box should have enough of the GEDCOM to identify
>> a NAME record, expressions to find the name parts
>>
>> (<*>) "(<*>)" (<*>)
>>
>> (if it is known that only a first "nick" last format is in use -- the <
>> and > are beginning of word, end of word). Replace box would be
>>
>> \1 \2 \3
>
> Problem: Word's regular expressions are greedy -- they match the
>longest stretch found, and lack a way to specify "end-of-line" to limit
>the match.
>
>Find: 1 NAME (*) "(*)" (*)
>Replace: 1 NAME \1 \2 \3
>
> Almost works, but without the end-of-line limit Word will start from
>the first "1 NAME" record and match to the first ", then match to the
>next ", and then match the rest of the document.
>
> Powershell command (all on one line, starting at get-content, put in
>/your/ file paths) {as mentioned, Powershell 2.0 is a download for
>WinXP, comes with Win7 -- Powershell 3.0 is a download for Win7}
>
>Microsoft Windows XP [Version 5.1.2600]
>(C) Copyright 1985-2001 Microsoft Corp.
>
>E:\UserData\Wulfraed\My Documents>powershell
>Windows PowerShell
>Copyright (C) 2009 Microsoft Corporation. All rights reserved.
>
>PS E:\UserData\Wulfraed\My Documents> get-content
>'E:\UserData\Wulfraed\My Documents\bieber.ged' | foreach {$_ -replace
>'1 NAME (.*) "(.*)" (.*)', '1 NAME $1 $2 $3'} >
>'E:\UserData\Wulfraed\My Documents\new.ged'
>
>PS E:\UserData\Wulfraed\My Documents>
>
> get-content is basically the same as "type", | feed the lines output
>by get-content to the foreach loop, $_ current line in loop, find "1
>NAME anything "more" rest" replace with "1 NAME firstgroup secondgroup
>lastgroup"
>write output to file...
>
> As this works on a line-by-line basis, it does not get greedy.
>BUT if there is more than one pair of " on a NAME line, only one will be
>matched. It did not affect "s found in other fields, in my test.
>
> If you have names in other fields, OR your GEDCOM puts names over
>continuation lines, the regular expression will need to be changed
>(continuation lines will be tricky -- may not be possible as a simple
>one-liner).
>
>--
> Wulfraed Dennis Lee Bieber AF6VN
> wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/

I think I'll have to read that a few times to get it - but thanks.

Hugh
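
For anyone else reading along, the line-by-line substitution described in the quoted post can also be sketched in Python. This is only an illustration under assumptions, not the PowerShell command itself: the function names and the sample GEDCOM lines are made up for the example, and the pattern is anchored with ^ and $ (the quoted one-liner is not) so that only lines beginning with "1 NAME" are touched.

```python
import re

# Same idea as the quoted PowerShell pattern: first "nick" last on a
# level-1 NAME line. The ^ and $ anchors are an addition for this sketch.
NAME_WITH_NICK = re.compile(r'^1 NAME (.*) "(.*)" (.*)$')


def strip_nick_quotes(line):
    """Drop the quotes around a nickname on a '1 NAME' line.

    Lines that do not match are returned unchanged.
    """
    return NAME_WITH_NICK.sub(r'1 NAME \1 \2 \3', line)


def convert(lines):
    """Apply the substitution to each line independently."""
    return [strip_nick_quotes(line) for line in lines]


if __name__ == "__main__":
    # Hypothetical sample records, invented for this example.
    sample = [
        '0 @I1@ INDI',
        '1 NAME John "Jack" /Smith/',
        '1 NOTE Known as "Jack" to friends',
        '1 NAME Mary /Jones/',
    ]
    for line in convert(sample):
        print(line)
```

Because each line is handled on its own, the match cannot run on into the next record, which is the point made above about Word. The caveat from the post still applies: with two quoted pairs on one NAME line, only one pair loses its quotes.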