It is relatively easy to take a document and print it to a PDF file.
For example, I can open a text file in LibreOffice and print it to PDF.

Now, I have a book scanned to a PDF file, and I would like to extract
the page images and open them in an image editor, so that I can check
off each entry in a list as I integrate the information into a database.

I presume that if I can print to a PDF file, then I can print to a JPG
file as well.

Denis

--
Denis Beauregard - généalogiste émérite (FQSG)
Les Français d'Amérique du Nord - www.francogene.com/genealogie--quebec/
French in North America before 1722 - www.francogene.com/quebec--genealogy/
Sur cédérom à 1780 - On CD-ROM to 1780
On 20 Nov at 1:13, Denis Beauregard
<denis.b-at-francogene.com@fr.invalid> wrote:

> It is relatively easy to take a document and to print it to a PDF
> file. For example, I can read a text file and print it with
> LibreOffice.
>
> Now, I have a book scanned to a PDF file and I would like to extract
> the images to enter them in an image processor so as to check each
> entry in a list when I integrate the information into a database.
>
> I presume if I can print to a PDF file, then I can print to a JPG file
> as well.

Yes, though some programs make this easier than others.

But scanning images and putting them into a PDF file is not the same as
integrating the information into a database. You first have to extract
the information from the images. OCR (optical character recognition)
software can do this, but its accuracy is not 100%, and if the book is
old with variable print, the accuracy will be lower still.

--
Tim Powys-Lybbe tim@powys.org
for a miscellany of bygones: http://powys.org/
On Tue, 20 Nov 2012 01:27:37 +0000, Tim Powys-Lybbe <tim@powys.org>
wrote in soc.genealogy.computing:

>On 20 Nov at 1:13, Denis Beauregard
><denis.b-at-francogene.com@fr.invalid> wrote:
>
>> It is relatively easy to take a document and to print it to a PDF
>> file. For example, I can read a text file and print it with
>> LibreOffice.
>>
>> Now, I have a book scanned to a PDF file and I would like to extract
>> the images to enter them in an image processor so as to check each
>> entry in a list when I integrate the information into a database.
>>
>> I presume if I can print to a PDF file, then I can print to a JPG file
>> as well.
>
>Yes. But some programs do or allow this more easily than others.
>
>But scanning images and putting them into a PDF files is not the same as
>integrating the information into a database. You first have to extract
>the information from the images. OCR, Optical Character Readers, can do
>this but the accuracy is not 100% and if the book is old with variable
>print, the accuracy will be even less.

My immediate need is for a series of records, some of which are new:
a list of burial records where, for some of them, the church record is
lost. So as a first step, I check off the records that are already in my
reference database, and then I add the remaining records to my database.

I don't want to scan them. I already did this when I was using
Acrobat 5, which is not compatible with the more recent PDF format, so I
was looking for something to do the same thing with Windows 7 and the
new format.

Denis
Denis Beauregard wrote:

> It is relatively easy to take a document and to print it to a PDF
> file. For example, I can read a text file and print it with
> LibreOffice.

There are commands in Linux that will:
- make a JPG of every single page, or
- extract the images and construct, e.g., an HTML file with the text
  around them.

For example:

    pdftohtml -c -p -dev jpeg

But if you have ImageMagick, it's as easy as:

    convert file.pdf file.jpg

http://www.linuxquestions.org/questions/linux-software-2/how-to-convert-pdf-to-jpg-910611/

Online PDF to JPG (the cloud works for free):
http://pdf2jpg.net/

--
What's on Shortwave guide: choose an hour, go! http://shortwave.tk
700+ Radio Stations on SW http://swstations.tk
300+ languages on SW http://radiolanguages.tk
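
To make the two options above a bit more concrete (one JPG per page versus pulling out the embedded scans), here is a rough Python sketch that wraps the Poppler command-line tools pdfimages and pdftoppm. This is only an illustration, not something from the posts above: it assumes Poppler is installed and on the PATH (I believe Windows builds exist, but check for your system), and the function names, the "page" output prefix and the default of 300 dpi are my own choices.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(pdf, out_prefix, mode="extract", dpi=300, first=None, last=None):
    """Return the argument list for a Poppler command-line tool.

    mode="extract": pdfimages -j   (pull out the embedded images as stored)
    mode="render":  pdftoppm -jpeg (rasterize each page to a JPEG at `dpi`)
    """
    if mode == "extract":
        cmd = ["pdfimages", "-j"]
    elif mode == "render":
        cmd = ["pdftoppm", "-jpeg", "-r", str(dpi)]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # Optional page range; both tools accept -f (first) and -l (last).
    if first is not None:
        cmd += ["-f", str(first)]
    if last is not None:
        cmd += ["-l", str(last)]
    return cmd + [str(pdf), str(out_prefix)]


def pdf_to_images(pdf, out_dir, mode="extract", **options):
    """Run the chosen tool and return the files it wrote into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    cmd = build_command(pdf, out_dir / "page", mode, **options)
    if shutil.which(cmd[0]) is None:
        raise RuntimeError(f"{cmd[0]} not found on PATH; install Poppler first")
    subprocess.run(cmd, check=True)
    # Output files all start with the "page" prefix chosen above.
    return sorted(p for p in out_dir.iterdir() if p.name.startswith("page"))
```

Calling pdf_to_images("book.pdf", "scans") would then try to extract the scans into a "scans" folder (both names are placeholders). One caveat: pdfimages -j only writes JPEG files for images that are stored as JPEG inside the PDF; other encodings come out as PPM or PBM files, so for a scanned book mode="render" may be the more predictable route. With ImageMagick's convert, a multi-page PDF normally yields numbered files such as file-0.jpg, file-1.jpg, and adding -density 300 before the input file name raises the rendering resolution.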