Why not sort each block first? (To a temporary file, of course, leaving the transcriber's temporary file as it was.)

John Fairlie
Mail us at ..... john@fairlie.plus.com john.fairlie@blueyonder.co.uk
Home page... http://www.fairlie.plus.com

-----Original Message-----
From: Dave Mayall [mailto:dave@research-group.co.uk]
Sent: Tuesday, December 23, 2003 11:10 AM
To: FREEBMD-DISCUSS-L@rootsweb.com
Subject: Re: Matching entries

----- Original Message -----
From: "Dave Mayall" <david.mayall@ukonline.co.uk>
To: "Christopher Richards" <cmrichards@blueyonder.co.uk>
Cc: <dave@research-group.co.uk>
Sent: Tuesday, December 23, 2003 7:29 AM
Subject: Re: Matching entries

> On Mon, 22 Dec 2003 22:22:14 -0000, you wrote:
>
> >Fair comment..
> >The two entries are under Births Sept 1845. Name James Friend, Manchester
> >20 590.
> >The one transcribed by "sgaunt" has the district as MachesterXX and the one
> >by "James Fox" has the district as Manchester20.
> >Otherwise they are identical.
> >Christopher Richards
>
> OK, I'll check it out, and let you know.

Well, about half an hour of investigation has produced an answer, and as definitive answers are always better than guesses, you all have to suffer the explanation!

FreeBMD doesn't just match individual records; it considers records in blocks (we call them accessions), and looks for alignment between blocks of records as part of the match process. This process causes apparently identical records not to be matched if there is a discrepancy in sequence.

First, if we look at a block from the 2 relevant accessions;

Submitted by James Fox (a ONENAME file)

FRIEND,Elizabeth,Eastry,1845,Sep,5,144,,B
FRIEND,James,Lambeth,1845,Sep,4,309,,B
FRIEND,James,Manchester,1845,Sep,20,590,,B
FRIEND,James Pizzey,Orsett,1845,Sep,12,181,,B
FRIEND,Jane,Lambeth,1845,Sep,4,225,,B

Submitted by Steve Gaunt (a SEQUENCED file)

Friend,Elizabeth,Eastry,V,144
Friend,James,Manchester,XX,590
Friend,James,Lambeth,IV,309
Friend,James Pizzey,Orsett,XII,181
Friend,Jane,Lambeth,IV,225

Steve's file maintains the order from the index. James' is sorted, and reverses the Manchester and Lambeth records.

Next we need to look at the way the matching process works..... [I wonder how many people are now saying "tell me no more" :-)]

The process will take one file first (it is possible to work out which it will take first, but that is more detail than we need for this explanation), and in this case it will take James Fox's file. The alignment now looks like this;

FRIEND,Elizabeth,Eastry,5,144
FRIEND,James,Lambeth,4,309
FRIEND,James,Manchester,20,590
FRIEND,James Pizzey,Orsett,12,181
FRIEND,Jane,Lambeth,4,225

Now it attempts to align Steve's file, and discovers that Steve has an extra record for James in Manchester between Elizabeth and James in Lambeth, so it assumes that James Fox has omitted an entry. At this stage the alignment looks like;

FRIEND,Elizabeth,Eastry,5,144    |Friend,Elizabeth,Eastry,V,144
---------------------------------|Friend,James,Manchester,XX,590
FRIEND,James,Lambeth,4,309       |Friend,James,Lambeth,IV,309
FRIEND,James,Manchester,20,590
FRIEND,James Pizzey,Orsett,12,181
FRIEND,Jane,Lambeth,4,225

Then, it discovers that the next record in Steve's file is James Pizzey, and that Steve seems to have missed out James in Manchester!

The alignment now looks like this;

FRIEND,Elizabeth,Eastry,5,144    |Friend,Elizabeth,Eastry,V,144
---------------------------------|Friend,James,Manchester,XX,590
FRIEND,James,Lambeth,4,309       |Friend,James,Lambeth,IV,309
FRIEND,James,Manchester,20,590   |------------------------------------
FRIEND,James Pizzey,Orsett,12,181|Friend,James Pizzey,Orsett,XII,181
FRIEND,Jane,Lambeth,4,225        |Friend,Jane,Lambeth,IV,225

This is the final alignment, and it now collapses it into the search table, marking each entry as single or double keyed;

FRIEND,Elizabeth,Eastry,5,144,D
Friend,James,Manchester,20,590,S
FRIEND,James,Lambeth,4,309,D
FRIEND,James,Manchester,20,590,S
FRIEND,James Pizzey,Orsett,12,181,D
FRIEND,Jane,Lambeth,4,225,D

Now you and I can spot exactly where this went wrong, but teaching a computer to work round things like this is a VERY tricky thing indeed!

Obviously something that Barrie and I need to think about over Christmas.
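
For anyone who wants to experiment, the behaviour walked through above can be reproduced with a small Python sketch. This is not FreeBMD's matching code, only a guess at a greedy in-order alignment that gives the same result on this one block. The names key and align, the tuple layout, and the choice to compare on surname, given names, district and page (ignoring the volume, since one file has 20 and the other XX) are all assumptions made for illustration. The last two lines try the "sort each block first" suggestion on copies of the data.

```python
# Records as (surname, given names, district, volume, page); values copied
# from the two accessions quoted in the thread.
fox = [
    ("FRIEND", "Elizabeth", "Eastry", "5", "144"),
    ("FRIEND", "James", "Lambeth", "4", "309"),
    ("FRIEND", "James", "Manchester", "20", "590"),
    ("FRIEND", "James Pizzey", "Orsett", "12", "181"),
    ("FRIEND", "Jane", "Lambeth", "4", "225"),
]
gaunt = [
    ("Friend", "Elizabeth", "Eastry", "V", "144"),
    ("Friend", "James", "Manchester", "XX", "590"),
    ("Friend", "James", "Lambeth", "IV", "309"),
    ("Friend", "James Pizzey", "Orsett", "XII", "181"),
    ("Friend", "Jane", "Lambeth", "IV", "225"),
]


def key(rec):
    # Hypothetical match key: upper-cased surname, given names, district, page.
    # The volume is ignored because the files write it differently (20 vs XX).
    surname, given, district, _volume, page = rec
    return (surname.upper(), given, district, page)


def align(a, b):
    """Greedy in-order alignment of two blocks.

    Returns a list of (key, flag) pairs, flag "D" for double keyed
    (present in both at this position) or "S" for single keyed.
    """
    ka = [key(r) for r in a]
    kb = [key(r) for r in b]
    out = []
    i = j = 0
    while i < len(ka) and j < len(kb):
        if ka[i] == kb[j]:
            out.append((ka[i], "D"))
            i += 1
            j += 1
        elif ka[i] in kb[j + 1:]:
            # a's current record turns up later in b, so assume a omitted kb[j].
            out.append((kb[j], "S"))
            j += 1
        elif kb[j] in ka[i + 1:]:
            # b's current record turns up later in a, so assume b omitted ka[i].
            out.append((ka[i], "S"))
            i += 1
        else:
            # Neither record is found ahead in the other block.
            out.append((ka[i], "S"))
            out.append((kb[j], "S"))
            i += 1
            j += 1
    # Whatever is left over in either block is single keyed.
    out.extend((k, "S") for k in ka[i:])
    out.extend((k, "S") for k in kb[j:])
    return out


unsorted_flags = [flag for _, flag in align(fox, gaunt)]
sorted_flags = [flag for _, flag in
                align(sorted(fox, key=key), sorted(gaunt, key=key))]
print(unsorted_flags)
print(sorted_flags)
```

On this block the unsorted run flags the entries D, S, D, S, D, D, the same pattern as the search table above, and the run on sorted copies flags all five entries D. Whether sorting is safe in general is a separate question, since it throws away the index order that a SEQUENCED file is there to preserve.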