RootsWeb.com Mailing Lists
    1. Re: [PABLAIR] Weekly Messenger Offering - 1846
    2. Wayne Webb
    3. Dick, The image is "fuzzy" because the text in the original has bled over the years. We are talking about publications anywhere from 100 to 180 years old. This can be remedied by sharpening the image after scanning. However, sharpening archival materials is specific to each item. Because each item fades at a different rate, the settings used to sharpen one image will in all likelihood not apply to the next. This is generally not a problem when dealing with a single photograph. But when digitizing a publication (books, newspapers, etc.), where you are customarily dealing with hundreds of pages, it becomes a question of labor, with each page needing individual attention. Yes, the task can be automated, but then you run into the problem that what works for this page does not necessarily work for a page further into the work. The short of it is that for periodicals you do not sharpen. Finally, the amount to sharpen an image, photograph or periodical, is subjective to the person's eyes, with no two people agreeing on the proper amount.
The image could have been OCRed with no problem, but that raises two items of discussion. To get better OCR results you would want the image to be in color and at least 600 pixels per inch. For an item as large as what I presented, the file would have been somewhere around 250 megabytes, and that for an item that was just for show and tell. Not the PDF, though that would have been large as well, but the TIF image I used to create the PDF. I purchase my hard drives out of my meager budget, so I decide. At present I have somewhere around 16 drives with another 4 waiting to be used. The short of it is that I have settled on a resolution of 400ppi as a good balance of quality vs. quantity. And before you ask, "But don't you have DVD backups?", let me state that yes I do, for select record sets. I will let you count how many DVDs you need for 4+ terabytes.
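The resolution trade-off described above can be put into rough numbers. What follows is only a minimal sketch, not the workflow used for these scans: it assumes an uncompressed 24-bit color image and a hypothetical 11 x 17 inch page (neither figure comes from the message), and the function names are made up for illustration.

```python
def uncompressed_scan_bytes(width_in, height_in, ppi, bytes_per_pixel=3):
    """Size in bytes of an uncompressed scan.

    bytes_per_pixel=3 assumes 24-bit color (an assumption, not a stated fact).
    """
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bytes_per_pixel


def size_ratio(ppi_low, ppi_high):
    """Fraction of the pixels (and so of the uncompressed bytes) kept when
    scanning the same page at ppi_low instead of ppi_high."""
    return (ppi_low / ppi_high) ** 2


# Hypothetical page dimensions in inches, chosen only for illustration.
PAGE_W_IN, PAGE_H_IN = 11, 17

for ppi in (400, 600):
    size_mb = uncompressed_scan_bytes(PAGE_W_IN, PAGE_H_IN, ppi) / 1_000_000
    print(f"{ppi} ppi: {size_mb:.1f} MB uncompressed")

print(f"400 ppi keeps {size_ratio(400, 600):.0%} of the 600 ppi pixel count")
```

Because the pixel count grows with the square of the resolution, a 400ppi scan of a given page holds (400/600)^2, or about 44%, of the pixels of a 600ppi scan, so roughly 2.25 times as many pages fit on the same drive before any compression is applied.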
Using one of my German Baptist Brethren newspapers as an example, let me say this. If I scanned all 800+ pages at 600ppi, I would be buying a hard drive after three or four sets, roughly. And since for failsafe I have all my drives RAID mirrored, that means two drives would be used. But by settling on 400ppi I am able to get much, much more on one hard drive. I get decent OCR results at the 400ppi resolution, so I am satisfied with the quality vs. quantity equation.
There are two other reasons I did not OCR the image and then post the text. The first is that the resulting posting would have been extremely large. The second is that the time required to OCR and then check the results was more than I was willing to spend. If it had been a financial situation or for my own personal research, that would have been a different situation.
Wayne Webb
P.S. In another issue of the "Weekly Messenger" there was a one and three-quarter page letter written from President James K. Polk to the citizens of the United States. The letter discussed the wherefores and whereto of the Mexican War and was set in, roughly, a point size of 4. It is a highly interesting article, at least to me, but there was no way I was going to process a 4,000 or so word article. Again, if it were for something other than general dissemination it would have been a different matter. And the PDF file would have been somewhere around 40 megabytes or so, with the TIF files likely being over 300 megabytes in size.
> -----Original Message-----
>
> WAYNE
> The PDF opened just fine for me. The image is a bit fuzzy and since it
> has some very small type I can see why you chose not to OCR the thing.
> Maybe someday the technology will be up to this job ...
> Thanks for posting these.
> DICK FOLKERTH
> Dallas

    03/30/2010 07:28:37