RootsWeb.com Mailing Lists
Subject: RE: Latest update
From: Archer Barrie

Sorry, Dave, you have the example wrong! The results would be:

    80 total
    43 distinct
    40 unique

Your juxtaposition of the meaning of 'distinct' and 'unique' is probably
more logical, but 'distinct' was being used already in the current sense.

Barrie

> -----Original Message-----
> From: Dave Mayall [mailto:dave@research-group.co.uk]
> Sent: 03 February 2004 08:35
> To: FREEBMD-DISCUSS-L@rootsweb.com
> Subject: Re: Latest update
>
> ----- Original Message -----
> From: "John Fairlie" <john.fairlie@blueyonder.co.uk>
> To: <FREEBMD-DISCUSS-L@rootsweb.com>
> Sent: Monday, February 02, 2004 5:50 PM
> Subject: RE: Latest update
>
> > OK, I give up. Please explain "distinct" records as opposed to
> > "Unique" records.
>
> :-)
>
> We implemented a solution to solve the overcounting that you
> identified!
>
> Consider a page of 40 entries, double keyed, with 3 entries
> transcribed differently by the transcribers.
>
> That would be 80 total records, it would also be 43 unique
> records, giving an overcount of 3 records to the total, and
> messing the stats up.
>
> We now analyse the alignment of unmatched records, and do an
> additional count on records which don't actually match, but
> which (because of their sequence) are obviously different
> transcriptions of the same entry, and in the distinct records
> count, only count them once, thus there would be 40 distinct
> records.
>
> This achieves two things:
> 1) More accurate stats
> 2) Data that tells us about the degree of mismatch between
>    double keyings (the difference between Unique and distinct is
>    the number of mismatches)
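The arithmetic in this exchange can be checked with a small sketch. It uses the terms in Barrie's sense (43 distinct, 40 unique); Dave's message has the two words the other way round. The function name `count_records`, the sample data, and the position-by-position alignment are all assumptions made for illustration. The thread does not describe how the alignment of unmatched records is actually done, so this is not the FreeBMD implementation.

```python
def count_records(first_keying, second_keying):
    """Return (total, distinct, unique) for two keyings of the same page.

    Illustrative simplifications (assumptions, not from the thread):
    both keyings list the same entries in the same order, and no entry
    is repeated within a single keying.
    """
    # Every transcribed record, counted once per keying.
    total = len(first_keying) + len(second_keying)

    # Different record values across both keyings. A mismatched entry
    # contributes two values here, which is the overcount.
    distinct = len(set(first_keying) | set(second_keying))

    # Records at the same sequence position that do not match are taken
    # to be different transcriptions of the same entry.
    mismatches = sum(1 for a, b in zip(first_keying, second_keying) if a != b)

    # Count each mismatched pair once instead of twice.
    unique = distinct - mismatches
    return total, distinct, unique


# A page of 40 entries, double keyed, with 3 entries transcribed differently.
first = [f"entry {i}" for i in range(40)]
second = list(first)
for i in (5, 17, 30):  # hypothetical positions of the three mismatches
    second[i] = first[i] + " (variant)"

print(count_records(first, second))  # (80, 43, 40)
```

The gap between the second and third figures is the number of mismatches between the two keyings, which is the second benefit listed in the quoted message.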

    02/05/2004 02:14:11