OK, I give up. Please explain "distinct" records as opposed to "unique" records.

John Fairlie
Mail us at: john@fairlie.plus.com, john.fairlie@blueyonder.co.uk
Home page: http://www.fairlie.plus.com

-----Original Message-----
From: Peter Dauncey [mailto:peter@dauncey54.freeserve.co.uk]
Sent: Monday, February 02, 2004 12:37 PM
To: FREEBMD-DISCUSS-L@rootsweb.com
Subject: Latest update

The latest update shows increases of 2,594,684 in the total number of records; 1,122,605 in the number of distinct records; and 1,160,379 in the number of unique records.

The Year/Event showing the largest increase in total records is 1875 Births, but it would be misleading to highlight this as the area most likely to contain a missing ancestor. Although total records have increased by 452,909 (from 904,599 to 1,357,508), the distinct records have increased by 61,804 (from 824,721 to 886,525) and unique records have increased by just 32,579 (from 815,309 to 847,888). The bulk of the "new" records are second keying. I have therefore based my analysis on "distinct" records.

There are 726,701 more Births. The big increases are for 1907 with 180,147 and 1891 with 121,882, but there are 9 other years with increases over 20K: 1878 (69,939); 1875 (61,804); 1842 (50,092); 1873 (42,956); 1844 (41,035); 1887 (31,564); 1890 (26,017); 1861 (26,004) and 1871 (23,644).

There are 86,264 fewer Marriages than before. The only year with a sizeable increase is 1907 with 43,754. The major reduction is 1849 with -33,965, but there are 4 other years with reductions in excess of 10K records.

There are 482,168 more Deaths. The big increases are for 1910 with 180,147 and 1888 with 92,860, but there are 4 other years with increases over 16K: 1852 (64,301); 1886 (41,522); 1847 (35,538) and 1841 (32,060).

Happy searching/transcribing

Peter Dauncey
----- Original Message -----
From: "John Fairlie" <john.fairlie@blueyonder.co.uk>
To: <FREEBMD-DISCUSS-L@rootsweb.com>
Sent: Monday, February 02, 2004 5:50 PM
Subject: RE: Latest update

> OK, I give up. Please explain "distinct" records as opposed to "Unique"
> records.

:-) We implemented a solution to solve the overcounting that you identified!

Consider a page of 40 entries, double keyed, with 3 entries transcribed differently by the two transcribers. That would be 80 total records and 43 unique records (37 on which the keyings agree, plus 2 versions of each of the 3 disputed entries). The unique count therefore overstates the real number of entries by 3, which messes the stats up.

We now analyse the alignment of unmatched records and do an additional count of records which don't actually match but which (because of their sequence) are obviously different transcriptions of the same entry. In the distinct records count we only count them once, so there would be 40 distinct records.

This achieves two things:

1) More accurate stats
2) Data that tells us about the degree of mismatch between double keyings (the difference between unique and distinct is the number of mismatches)
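
For anyone who wants to check the arithmetic, here is a rough Python sketch of the three counts for the 40-entry page described above. It is only an illustration, not the FreeBMD code: the function name `count_records`, the placeholder entry strings, and the use of `difflib.SequenceMatcher` to line up the two keyings by sequence are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def count_records(keying_a, keying_b):
    """Return (total, unique, distinct) for two keyings of the same page.

    Hypothetical helper: each keying is a list of transcribed entries,
    in page order. Assumes no entry text repeats within a page.
    """
    # Total: every record keyed by either transcriber.
    total = len(keying_a) + len(keying_b)

    # Unique: identical transcriptions collapse, differing ones do not.
    unique = len(set(keying_a) | set(keying_b))

    # Distinct: align the keyings by sequence. Matching runs count once;
    # unmatched runs at the same position are treated as different
    # transcriptions of the same entries and also count once each.
    distinct = 0
    matcher = SequenceMatcher(a=keying_a, b=keying_b, autojunk=False)
    for _tag, i1, i2, j1, j2 in matcher.get_opcodes():
        distinct += max(i2 - i1, j2 - j1)

    return total, unique, distinct


# A page of 40 entries, double keyed, with 3 entries keyed differently.
first_keying = [f"Entry {n:02d}" for n in range(1, 41)]
second_keying = list(first_keying)
for position in (4, 16, 29):
    second_keying[position] += " (variant spelling)"

total, unique, distinct = count_records(first_keying, second_keying)
print(total, unique, distinct)   # 80 43 40
print(unique - distinct)         # 3 mismatches between the keyings
```

Counting an unmatched run as the longer of its two sides is a simplification for the case where one transcriber has an entry the other missed; the real alignment analysis may well treat that case differently.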