> Subject: Re: microfische
> Resent-Date: Wed, 10 Jun 1998 07:27:19 -0700 (PDT)
> Resent-From: GenTips-L@rootsweb.com
> Date: Wed, 10 Jun 1998 15:15:45 +0100
> From: "joe.power" <joe.power@which.net>
> To: GenTips-L@rootsweb.com
> References: <01bd9487$650ba640$cf5cf8c6@default>
>
> Hello Marcia,
>
> The idea is that you scan a document and then use an OCR program to recognise
> the letters in the PICTURE of the document. The OCR program then stores the
> TEXT as a file that you can word process, put into a database, spreadsheet or
> whatever you'd normally do with some text. The catch is of course that none of
> the OCR programs get it 100% correct so you have to check it all.
>
> With regard to fiche my idea was to scan a fiche (e.g. St. Catherine's Index)
> at very high resolution, 1200 dpi or better, and then run it through an OCR
> program to get data files. The next step would be to get lots of people to do
> the same and publish the text on their websites in a Webring. It wouldn't take
> too long before the whole thing was out there in cyberspace waiting to be
> looked at by people like us. This would remove the need to travel to the
> locations of the fiche and also the whole Index would be searchable by
> computer! A by-product is that automatic indexing of census results should be
> achievable.
>
> Whilst this sounds very good there appear to be problems. The OCR I've tried
> on photocopies of a Census from fiche was not too successful and the
> magnification of the scanner may not be enough. I'm still thinking about it
> though. Anyway the World Cup is about to start!
>
> Cheers
>
> Joe de la Poer Power

These messages are being resent to me. As to scanning microfiche to CD: the
appliance parts and repair industry is already doing this, as possibly are many
others. They are not doing OCR, just putting the illustrations on CD. As to
OCRing the text: the text should be enlarged to about 10 points.
A resolution of more than 300 dots per inch for text of this size is overkill.
However, a good OCR program is worth the expense. So the service is available
if one can ferret out the provider, and the resulting images should lend
themselves to OCR.

Hugh Broesamle
hughbro_3@juno.com
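
As a rough check on the resolution figures mentioned in this thread, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are made up for the example, and the 24x reduction ratio is an assumed value, since neither message says what reduction the fiche in question actually uses.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 inch


def text_height_pixels(point_size, dpi):
    """Nominal height in pixels of text of the given point size at a given dpi."""
    return point_size / POINTS_PER_INCH * dpi


def effective_dpi(scanner_dpi, reduction_ratio):
    """Resolution relative to the original page when the reduced film image
    is scanned directly, with no optical enlargement."""
    return scanner_dpi / reduction_ratio


def required_scanner_dpi(target_dpi, reduction_ratio):
    """Scanner resolution needed on the film itself to reach target_dpi
    relative to the original page."""
    return target_dpi * reduction_ratio


if __name__ == "__main__":
    ASSUMED_REDUCTION = 24  # hypothetical reduction ratio, not taken from the thread

    # 10-point text at 300 dpi
    print(text_height_pixels(10, 300))
    # A 1200 dpi scan of the film, expressed relative to the original page
    print(effective_dpi(1200, ASSUMED_REDUCTION))
    # Scanner resolution needed on the film to match 300 dpi on the page
    print(required_scanner_dpi(300, ASSUMED_REDUCTION))
```

Under that assumed ratio, a direct 1200 dpi scan of the film falls well short of 300 dpi relative to the original page, which would be consistent with the advice to enlarge the text to about 10 points before running OCR.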