On Mon, 27 Oct 2003 17:07:20 -0000, you wrote:

>Hmmmm. That's interesting. I will just make the following comments (in
>this optional discussion!!)
>
>1. Surely syndicate leaders must have some idea how many pages have not
>been uploaded so far versus how many pages were in that quarter? This could
>give a clearer measure of coverage.

Well, they know how many should be there. Unfortunately, looking at what
is actually there seldom agrees with what the co-ord thinks should be
there.

>2. Personally, I find the coverage graphs very useful. If I am searching a
>period where the graphs are at 100% and I still can't find my event, I would
>have assumed that it wasn't there. But now, from what you say, 100% could
>mean 90% done and 10% are non identical second keyings of some of the 90%,
>with 10% not done at all yet! This is somewhat different!! My event could
>be in the 10% not done at all yet.

Yes.

>3. So are the graphs capped at 100%? (Obviously yes!). I can see that
>with some non-identical second keyings, figures could rise above 100%. We
>need to distinguish between those years at 100% (which may be 90% done and
>10% non identical second keyings) from those at 100% (which are 100% done
>plus some x% for non-identical second keyings).

The question is how you determine "x".

>4. I still feel we need a measure of data quality that reflects true
>completeness, degree of second keying, degree of matching first and second
>keying, and extent of uncertain characters. Yes, I know we are all busy
>volunteers!!

Quite apart from finding the resource to do such a thing, we still
haven't established *how* we could do it.

If somebody can specify *how* we are to derive a more accurate figure
from the data available, then we can do something about it.

Bear in mind that whatever we do it has to be capable of being produced
within 24 hours of each update (otherwise we start getting complaints).

>You said "It is not and cannot be an exact science." I say - If we are
>making an >exact< copy of the GRO indexes, we have to make it an exact
>science!! Those that disagree with me are probably those that want a quick
>and dirty copy of the GRO indexes

I disagree, and not for that reason.

The indexes must be accurate, because that is what we are doing. It is
our core purpose.

Producing statistics is NOT. Devoting huge amounts of time to producing
these statistics would be a distraction from what we should be doing.

-- 
Dave Mayall