In a message dated 1/6/04 1:03:34 AM US Mountain Standard Time, FREEBMD-DISCUSS-D-request@rootsweb.com writes: Hi All A long time ago I went through the need for a ruler to mark my transcribing position Surely you jest! Am still laughing....Jj
On Mon, 5 Jan 2004 15:16:56 -0500, you wrote: >Does anyone know what percentage of the first typing has been completed? Approximately 65% of 1837-1900 -- Dave Mayall
Hi All A long time ago I went through the need for a ruler to mark my transcribing position. I had the ruler suspended from the top of the monitor with a nifty ratchet-like gizmo, operated by my left foot, to lower the ruler in suitable increments. My right foot operated the ruler return mechanism (I had to be careful as a heavy right foot sent the whole contraption flying). I then found it easier to paint a ruler on the monitor, about 1/3 up the screen. I keep the current line just above the "ruler" when transcribing. The drawback of having a ruler painted on the screen is that you have to type around it. cheers Bob Phillips ----- Original Message ----- From: "camred" <camred@ntlworld.com> To: <FREEBMD-DISCUSS-L@rootsweb.com> Sent: Monday, January 05, 2004 7:54 PM Subject: Ruler in WinBMD > I think it would help > > -- > Victor > > > I transcribe for FreeBMD at http://freebmd.rootsweb.com/
I think it would help -- Victor I transcribe for FreeBMD at http://freebmd.rootsweb.com/ *********************************************************** All incoming & outgoing mail is virus checked by Norton AV ***********************************************************
YES please Loraine
Does anyone know what percentage of the first typing has been completed? Lynda
I quite like the idea of the red line, but am having trouble visualising using it. It sounds as if it could slow down transcribing for some yet be helpful to others, so perhaps it could be switched on and off. The way I work is with the scan in a small window showing only 1 column and the WinBMD window covering the bottom half. When I have to move the scan up to view more entries, the scan window covers the WinBMD window, which I then have to bring to the front. If I had to do this with every entry to move the red line, it would take me forever to do one page. I hope this makes sense!! Lucille ----- Original Message ----- From: "Ian Brooke" <ianbrooke@hotmail.com> To: <FreeBMD-Admins-L@rootsweb.com> Sent: Monday, January 05, 2004 3:10 AM Subject: Re: Speed & WinBMD > This issue has been raised numerous times in the past and I've given it a lot of consideration. I remain unconvinced that a mechanism similar to BMDVerify is appropriate for WinBMD and believe that the two programs should remain separate and distinct. I also think that the automatic advance cursor in BMDVerify would, in WinBMD, be as liable to error (considering mis-aligned scans, different scan sizes, differing magnifications etc) as a manual method and I am not prepared to introduce a further potential source of error in an attempt to remove another. The differing ways that the two programs operate seem to me to currently allow mistakes to be noticed and are therefore a help rather than a hindrance. > Having said that, I am prepared to introduce a manually moved ruler, i.e. a narrow red line on top of the scan which can be moved using (probably) Control/Arrow (unless anyone has a better combination) and can be used to underline the row currently being transcribed but does NOT move automatically. > If anyone has any comments on this would they please address them to the Discuss list (FREEBMD-DISCUSS-L@rootsweb.com) which I think is more appropriate.
> Ian
The latest update shows an increase of 1,691,542 in the total number of records and an increase of 1,016,271 in the number of unique/distinct records.

This analysis compares this month's breakdown of distinct records with last month's breakdown. Future analyses could be based on total records, distinct records or unique records, but my initial leaning is towards total records. I would welcome other users' thoughts.

There are 526,840 more Births. The big increases are for 1907 (107,832); 1874 (80,991) and 1873 (78,634) but there are 4 other years with increases over 18K: 1891 (47,586); 1890 (35,808); 1870 (32,805) and 1842 (26,664).

There are 99,682 more Marriages. There are just 2 years with increases over 10K: 1860 (23,027) and 1839 (12,946).

There are 389,749 more Deaths. There are 8 years with increases over 13K: 1888 (49,726); 1841 (43,448); 1852 (37,921); 1858 (37,491); 1910 (30,014); 1865 (22,057); 1884 (21,377) and 1882 (19,154).

The site gives an explanation of the terms: total records, distinct records and unique records, and a lot more graphs are available.

Happy searching/transcribing
Peter Dauncey
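As a quick arithmetic check on the figures in this update, the three per-type increases can be summed and compared with the reported increase in distinct records. This is only a sketch; the variable names are illustrative and not taken from any FreeBMD tool.

```python
# Increases in distinct records quoted in the update, by record type.
increases = {"Births": 526_840, "Marriages": 99_682, "Deaths": 389_749}
reported_distinct_increase = 1_016_271

# The per-type figures should add up to the overall distinct increase.
total = sum(increases.values())
print(total == reported_distinct_increase)  # True

# Percentage share of the increase contributed by each record type.
shares = {name: round(100 * n / total, 1) for name, n in increases.items()}
print(shares)
```

The sum agreeing with 1,016,271 confirms that the per-type breakdown is of distinct records, as the message says, rather than of total records.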
Dave Mayall wrote: > On Wed, 24 Dec 2003 00:58:36 +0000, you wrote: > > >>Dave Mayall wrote: >> >>>>Thanks Dave - I wondered if James Fox was a "onenamer" and if this would >>>>prove to be the answer. And I did read to the end of your reply and, I >>>>think, understood the principle. >>>>Now another question - what happens when the second keying is done and >>> >>>there >>> >>> >>>>are now three entries for some of this particular batch of "Friends"? >>> >>> >>>As the software stands at present, the double keying would align onto >>>Steve's keying, and we would see one single keyed and one double keyed in >>>the results For the Manchester James. >>> >>>All the others would simply appear as double keyed (the system makes no >>>distinction between double or triple or 21-fold keying of a record) >>> >>>The challenge that faces us is to tweak the software to cope with this, and >>>it will be a challenging bit of software to write, for there are no easy >>>fixes here. >> >>I'd note that one-name submissions treat each line as a separate >>accession. If a one-name is submitted as something else, it breaks the >>system. The easy fix is to fix the type. > > > No, RANDOM treats each line as a separate Accession. ONENAME is a > halfway house which behaves like SEQUENCED except that it includes an > implicit +BREAK (and hence a new Accession) on change of surname. I guess that strictly ONENAME is wrong then, sadly. Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff
On Wed, 24 Dec 2003 00:58:36 +0000, you wrote: >Dave Mayall wrote: >>>Thanks Dave - I wondered if James Fox was a "onenamer" and if this would >>>prove to be the answer. And I did read to the end of your reply and, I >>>think, understood the principle. >>>Now another question - what happens when the second keying is done and >> >> there >> >>>are now three entries for some of this particular batch of "Friends"? >> >> >> As the software stands at present, the double keying would align onto >> Steve's keying, and we would see one single keyed and one double keyed in >> the results For the Manchester James. >> >> All the others would simply appear as double keyed (the system makes no >> distinction between double or triple or 21-fold keying of a record) >> >> The challenge that faces us is to tweak the software to cope with this, and >> it will be a challenging bit of software to write, for there are no easy >> fixes here. > >I'd note that one-name submissions treat each line as a separate >accession. If a one-name is submitted as something else, it breaks the >system. The easy fix is to fix the type. No, RANDOM treats each line as a separate Accession. ONENAME is a halfway house which behaves like SEQUENCED except that it includes an implicit +BREAK (and hence a new Accession) on change of surname. -- Dave Mayall
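The distinction Dave draws between the three file types can be sketched as below. This is an illustration under assumptions, not FreeBMD's code: `split_into_accessions` is a hypothetical name, each line is reduced to its surname, and explicit +BREAK lines in SEQUENCED files are left out.

```python
from typing import List


def split_into_accessions(surnames: List[str], file_type: str) -> List[List[str]]:
    """Group the lines of one file (surname only) into accessions."""
    if file_type == "RANDOM":
        # RANDOM: every line is a separate accession.
        return [[s] for s in surnames]
    if file_type == "SEQUENCED":
        # SEQUENCED: one block (explicit +BREAK lines are not modelled here).
        return [list(surnames)] if surnames else []
    if file_type == "ONENAME":
        # ONENAME: like SEQUENCED, plus an implicit break on change of surname.
        blocks: List[List[str]] = []
        for s in surnames:
            if blocks and blocks[-1][-1].upper() == s.upper():
                blocks[-1].append(s)
            else:
                blocks.append([s])
        return blocks
    raise ValueError(f"unknown file type: {file_type}")


lines = ["FRIEND", "FRIEND", "FRY", "FRY", "FRIEND"]
for kind in ("RANDOM", "SEQUENCED", "ONENAME"):
    print(kind, split_into_accessions(lines, kind))
```

With these five lines, RANDOM gives five accessions, SEQUENCED gives one, and ONENAME gives three, because the surname changes twice.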
Dave Mayall wrote: >>Thanks Dave - I wondered if James Fox was a "onenamer" and if this would >>prove to be the answer. And I did read to the end of your reply and, I >>think, understood the principle. >>Now another question - what happens when the second keying is done and > > there > >>are now three entries for some of this particular batch of "Friends"? > > > As the software stands at present, the double keying would align onto > Steve's keying, and we would see one single keyed and one double keyed in > the results For the Manchester James. > > All the others would simply appear as double keyed (the system makes no > distinction between double or triple or 21-fold keying of a record) > > The challenge that faces us is to tweak the software to cope with this, and > it will be a challenging bit of software to write, for there are no easy > fixes here. I'd note that one-name submissions treat each line as a separate accession. If a one-name is submitted as something else, it breaks the system. The easy fix is to fix the type. Anything presorted should be classed as one-name. Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff
> Thanks Dave - I wondered if James Fox was a "onenamer" and if this would > prove to be the answer. And I did read to the end of your reply and, I > think, understood the principle. > Now another question - what happens when the second keying is done and there > are now three entries for some of this particular batch of "Friends"? As the software stands at present, the double keying would align onto Steve's keying, and we would see one single keyed and one double keyed in the results For the Manchester James. All the others would simply appear as double keyed (the system makes no distinction between double or triple or 21-fold keying of a record) The challenge that faces us is to tweak the software to cope with this, and it will be a challenging bit of software to write, for there are no easy fixes here.
Thanks Dave - I wondered if James Fox was a "onenamer" and if this would prove to be the answer. And I did read to the end of your reply and, I think, understood the principle. Now another question - what happens when the second keying is done and there are now three entries for some of this particular batch of "Friends"? Christopher Richards ----- Original Message ----- From: "Dave Mayall" <dave@research-group.co.uk> To: <FREEBMD-DISCUSS-L@rootsweb.com> Sent: Tuesday, December 23, 2003 11:33 AM Subject: Re: Matching entries > > Why not sort each block first? > > Because that would completely defeat the object of doing the alignment in > the first place! > > The whole purpose is to trap missing records, and we can only do that by > looking at them in the as transcribed order.
> Why not sort each block first? Because that would completely defeat the object of doing the alignment in the first place! The whole purpose is to trap missing records, and we can only do that by looking at them in the as transcribed order.
Why not sort each block first? (To a temporary file of course, leaving the transcribers temporary file as it was.) John Fairlie Mail us at ..... john@fairlie.plus.com john.fairlie@blueyonder.co.uk Home page... http://www.fairlie.plus.com -----Original Message----- From: Dave Mayall [mailto:dave@research-group.co.uk] Sent: Tuesday, December 23, 2003 11:10 AM To: FREEBMD-DISCUSS-L@rootsweb.com Subject: Re: Matching entries
----- Original Message -----
From: "Dave Mayall" <david.mayall@ukonline.co.uk>
To: "Christopher Richards" <cmrichards@blueyonder.co.uk>
Cc: <dave@research-group.co.uk>
Sent: Tuesday, December 23, 2003 7:29 AM
Subject: Re: Matching entries

> On Mon, 22 Dec 2003 22:22:14 -0000, you wrote:
>
> >Fair comment..
> >The two entries are under Births Sept 1845. Name James Friend, Manchester
> >20 590.
> >The one transcribed by "sgaunt" has the district as MachesterXX and the one
> >by "James Fox" has the district as Manchester20.
> >Otherwise they are identical.
> >Christopher Richards
>
> OK, I'll check it out, and let you know.

Well, about half an hour of investigation has produced an answer, and as definitive answers are always better than guesses, you all have to suffer the explanation!

FreeBMD doesn't just match individual records, it considers records in blocks (we call them accessions), and looks for alignment between blocks of records as part of the match process.

This process causes apparently identical records not to be matched if there is a discrepancy in sequence.

First, if we look at a block from the 2 relevant accessions;

Submitted by James Fox (a ONENAME file)

FRIEND,Elizabeth,Eastry,1845,Sep,5,144,,B
FRIEND,James,Lambeth,1845,Sep,4,309,,B
FRIEND,James,Manchester,1845,Sep,20,590,,B
FRIEND,James Pizzey,Orsett,1845,Sep,12,181,,B
FRIEND,Jane,Lambeth,1845,Sep,4,225,,B

Submitted by Steve Gaunt (a SEQUENCED file)

Friend,Elizabeth,Eastry,V,144
Friend,James,Manchester,XX,590
Friend,James,Lambeth,IV,309
Friend,James Pizzey,Orsett,XII,181
Friend,Jane,Lambeth,IV,225

Steve's file maintains the order from the index. James' is sorted, and reverses the Manchester and Lambeth records.

Next we need to look at the way the matching process works.....
[I wonder how many people are now saying "tell me no more" :-)]

The process will take one file first (it is possible to work out which it will take first, but that is detail beyond that which we need to know for this explanation), and in this case it will take James Fox's file. The alignment now looks like this;

FRIEND,Elizabeth,Eastry,5,144
FRIEND,James,Lambeth,4,309
FRIEND,James,Manchester,20,590
FRIEND,James Pizzey,Orsett,12,181
FRIEND,Jane,Lambeth,4,225

Now it attempts to align Steve's file, and discovers that Steve has an extra record for James in Manchester between Elizabeth and James in Lambeth, so it assumes that James Fox has omitted an entry. At this stage the alignment looks like;

FRIEND,Elizabeth,Eastry,5,144|Friend,Elizabeth,Eastry,V,144
--------------------------------|Friend,James,Manchester,XX,590
FRIEND,James,Lambeth,4,309|Friend,James,Lambeth,IV,309
FRIEND,James,Manchester,20,590
FRIEND,James Pizzey,Orsett,12,181
FRIEND,Jane,Lambeth,4,225

Then, it discovers that the next record in Steve's file is James Pizzey, and that Steve seems to have missed out James in Manchester! The alignment now looks like this;

FRIEND,Elizabeth,Eastry,5,144|Friend,Elizabeth,Eastry,V,144
--------------------------------|Friend,James,Manchester,XX,590
FRIEND,James,Lambeth,4,309|Friend,James,Lambeth,IV,309
FRIEND,James,Manchester,20,590|------------------------------------
FRIEND,James Pizzey,Orsett,12,181|Friend,James Pizzey,Orsett,XII,181
FRIEND,Jane,Lambeth,4,225|Friend,Jane,Lambeth,IV,225

This is the final alignment, and it now collapses it into the search table, marking each entry as single or double keyed;

FRIEND,Elizabeth,Eastry,5,144,D
Friend,James,Manchester,20,590,S
FRIEND,James,Lambeth,4,309,D
FRIEND,James,Manchester,20,590,S
FRIEND,James Pizzey,Orsett,12,181,D
FRIEND,Jane,Lambeth,4,225,D

Now you and I can spot exactly where this went wrong, but teaching a computer to work round things like this is a VERY tricky thing indeed!
Obviously something that Barrie and I need to think about over Christmas.
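The effect described in this explanation can be reproduced with a generic sequence alignment. The sketch below uses Python's standard `difflib.SequenceMatcher` purely as a stand-in; FreeBMD's actual matching code is not shown in the thread and will differ. The helper names (`volume_to_int`, `key`, `align`) and the normalisation of Roman volume numbers are assumptions made for the illustration.

```python
from difflib import SequenceMatcher

ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100}


def volume_to_int(vol: str) -> int:
    """Accept a volume written as '20' or 'XX' and return 20."""
    if vol.isdigit():
        return int(vol)
    total = 0
    for ch, nxt in zip(vol, vol[1:] + " "):
        value = ROMAN[ch]
        # Subtractive notation: a smaller numeral before a larger one (IV = 4).
        total += -value if ROMAN.get(nxt, 0) > value else value
    return total


def key(surname, given, district, volume, page):
    """Comparison key: case-insensitive surname, numeric volume."""
    return (surname.upper(), given, district, volume_to_int(volume), page)


# James Fox's sorted ONENAME block, then Steve Gaunt's index-order block.
fox = [key("FRIEND", "Elizabeth", "Eastry", "5", "144"),
       key("FRIEND", "James", "Lambeth", "4", "309"),
       key("FRIEND", "James", "Manchester", "20", "590"),
       key("FRIEND", "James Pizzey", "Orsett", "12", "181"),
       key("FRIEND", "Jane", "Lambeth", "4", "225")]
gaunt = [key("Friend", "Elizabeth", "Eastry", "V", "144"),
         key("Friend", "James", "Manchester", "XX", "590"),
         key("Friend", "James", "Lambeth", "IV", "309"),
         key("Friend", "James Pizzey", "Orsett", "XII", "181"),
         key("Friend", "Jane", "Lambeth", "IV", "225")]


def align(first, second):
    """Align two blocks in order; 'D' = double keyed, 'S' = single keyed."""
    rows = []
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            rows += [(rec, "D") for rec in first[i1:i2]]
        else:
            # Unmatched records from either file are kept as single keyed.
            rows += [(rec, "S") for rec in second[j1:j2]]
            rows += [(rec, "S") for rec in first[i1:i2]]
    return rows


rows = align(fox, gaunt)
for rec, status in rows:
    print(",".join(str(field) for field in rec), status)
```

Because an order-preserving alignment cannot match both the Lambeth and the Manchester record when their order is swapped, the Manchester entry comes out twice as single keyed, which is the same pattern as the search table in the message. Sorting both blocks first would make every line match, but it would also discard the sequence information the alignment exists to check.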
Dick, This example will cause two entries to be displayed. This is the correct behaviour because the two entries are different. If the entries had been "MANCHESTER *" and "Manchester 20" then it could be resolved. It isn't at the moment but it will be. The way this will be handled is a manual process to determine why there are differences and to determine how they can be resolved. This decision will then be marked in the database to enable the entries to be matched. You say "this reason alone" but that is from a human logic viewpoint - it seems obvious to us that they are the same but what rule would you devise for a computer? Plainly you *could* say that if one district is valid and the other isn't you discard the one that isn't. But that is a dangerous route to take if you are relying on the double keying to detect errors - what if the valid one is a transcriber "correcting" the district (i.e. *not* typing what they see) and it is the invalid one that is in the register? One also has to take into account the result of not matching. You talk about "garbage entries" but in reality the researcher will see the two entries and can make their own decision. It will be just as obvious at that point that the two entries are one. Take into account that to be useful any rule must be general enough to cover a large number of cases and must produce few (preferably zero) false positives. Barrie -----Original Message----- From: Dick Bond [mailto:dick@bonds.plus.com] Sent: 22 December 2003 17:56 To: FREEBMD-DISCUSS-L@rootsweb.com Subject: Re: Matching Entries Christopher Richards wrote "I noticed two entries that appeared to be the same and wondered why they hadn't been matched. Further information shows that one transcriber had transcribed the district as "Manchester XX" and the other as "Manchester 20". Clearly transcriber 2 had not typed what he saw. Should these two entries be matched?" This is just the kind of example I have been asking about.
IF the ONLY difference between two entries is a variation in the way the SAME district is shown then will this cause there to be duplicate entries on the database? If duplicates arise from this reason alone then something should be done to modify the database creation such that this does not create garbage entries. Transcribers WILL make mistakes even when shouted at to TWYS. Dick Bond
On Sun, 21 Dec 2003 21:52:46 -0000, you wrote:

>However when it comes to District names I, as a 'fallible human' transcriber, do (or maybe did) not understand why problems might be created if the scan showed 'W.Derby' (i.e. dot but no space) and I typed 'W. Derby' (i.e. space after dot), or 'W Derby' (no dot). To me they all represent the same district. I also understood that there was an 'aliasing' process which would consolidate such variations in naming what is in fact the same district.

Correct.

>I had assumed that 'aliasing' occurred during the creation of the underlying database thus any differing versions of the same district name would be 'corrected' on that database.

It does occur then, and from that point onwards we hold District Name (as transcribed) and District ID (a link to the master list of canonical districts).

>If I understand Dave correctly, however, the underlying database will hold exactly the format of the district name that is keyed in - and 'Aliasing' is only carried out by the search process.

No. The database holds both items of information. It uses the aliased ID for search purposes, and the original data for display.

>IF my logic is correct so far .....
>
>Then, whilst my entry is the first (only) keying then there is in fact no (serious) problem since all versions of 'W. Derby' will be treated the same and any search will be correct.
>
>HOWEVER, when there is a second keying of the data and the next transcriber enters (say) 'W.Derby' then this difference will cause a second, unmatched entry onto the database.
>
>.......
>
>Thus I ask. Is my analysis correct? Will a variation in just the spelling of the district result in duplicate entries?

No.

--
Dave Mayall
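The arrangement Dave describes, keeping the district name as transcribed for display and an aliased district ID for searching, might look something like the sketch below. Everything concrete in it is an assumption made for illustration: the ID numbers, the contents of the alias table, and the names `normalise`, `Entry` and `make_entry` are hypothetical, and FreeBMD's real aliasing rules are not given in the thread.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical master list: canonical district name -> district ID.
CANONICAL = {"WEST DERBY": 101, "MANCHESTER": 102}
# Hypothetical alias table: normalised spelling -> canonical name.
ALIASES = {"W DERBY": "WEST DERBY"}


def normalise(name: str) -> str:
    """Upper-case, turn dots into spaces and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", name.replace(".", " ")).strip().upper()


@dataclass
class Entry:
    district_as_transcribed: str  # kept unchanged, used for display
    district_id: Optional[int]    # aliased ID, used for searching/matching


def make_entry(name: str) -> Entry:
    norm = normalise(name)
    canonical = ALIASES.get(norm, norm)
    return Entry(name, CANONICAL.get(canonical))


entries = [make_entry(n) for n in ("W.Derby", "W. Derby", "W Derby")]
for e in entries:
    print(repr(e.district_as_transcribed), e.district_id)
```

All three spellings resolve to the same ID while each entry keeps the text as typed, which is why a variation in the spelling of the district alone need not produce a second, unmatched entry.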
On Mon, 22 Dec 2003 17:55:38 -0000, you wrote: >Christopher Richards wrote > >"I noticed two entries that appeared to be the same and wondered why they >hadn't been matched. Further information shows that one transcriber had >transcribed the district as "Manchester XX" and the other as "Manchester >20". >Clearly transcriber 2 had not typed what he saw. Should these two entries be >matched?" > >This is just the kind of example I have been asking about. IF the ONLY difference between two entries is a variation in the way the SAME district is shown then will this cause there to be duplicate entries on the database? And the answer is the same as Barrie gave yesterday... No it won't. Where we have cases like this, we need to know which records are apparently not matching properly, so we can check them and work out exactly what the problem *IS*. Guesswork based upon assumptions about how the process might (but doesn't) work doesn't move it forward one bit. -- Dave Mayall
Christopher Richards wrote "I noticed two entries that appeared to be the same and wondered why they hadn't been matched. Further information shows that one transcriber had transcribed the district as "Manchester XX" and the other as "Manchester 20". Clearly transcriber 2 had not typed what he saw. Should these two entries be matched?" This is just the kind of example I have been asking about. IF the ONLY difference between two entries is a variation in the way the SAME district is shown then will this cause there to be duplicate entries on the database? If duplicates arise from this reason alone then something should be done to modify the database creation such that this does not create garbage entries. Transcribers WILL make mistakes even when shouted at to TWYS. Dick Bond