On Tue, 20 Nov 2012 09:29:43 +0000, Ian Goddard <goddai01@hotmail.co.uk> wrote in soc.genealogy.computing:

>Denis Beauregard wrote:
>> It is relatively easy to take a document and to print it to a PDF
>> file. For example, I can read a text file and print it with
>> LibreOffice.
>>
>> Now, I have a book scanned to a PDF file and I would like to extract
>> the images to enter them in an image processor so as to check each
>> entry in a list when I integrate the information into a database.
>>
>> I presume if I can print to a PDF file, then I can print to a JPG
>> file as well.
>
>AFAIK LibreOffice doesn't have a print to JPEG or whatever unless you
>actually have (if such a thing exists) a 3rd party application which
>provides a pseudo-printer.
>
>However, on a quick test I found I could open a PDF with LibreOffice,
>copy an image & paste it into a paint program. This is with Linux but
>presumably should also work on W7.

I presume the many LO versions all have the same features for the same version. W7 does it too!

>However, it wouldn't be my preferred way of doing this on Linux as there
>are command line tools to extract images from PDFs. In fact I used them
>recently to split a PDF consisting of scans of maps & get a series of
>TIFFs of the original individual sheets.
>
>Scanned books, e.g. from archive.org are often OCRd & you can also
>extract the text - complete with all the usual OCR artefacts.

I tried this solution and it more or less does the job. I open the PDF file with LO, delete the pages I don't need and copy the pages one by one into the paint software. I can't find how to edit the images directly, however. All I can do is add arrows, which is enough to indicate that I already have the data on a given line.

Anyway, I think I can do more or less what I want to do. Thanks for the idea. I didn't know LO could edit PDF files.
Denis -- Denis Beauregard - généalogiste émérite (FQSG) Les Français d'Amérique du Nord - www.francogene.com/genealogie--quebec/ French in North America before 1722 - www.francogene.com/quebec--genealogy/ Sur cédérom à 1780 - On CD-ROM to 1780
Denis Beauregard wrote:
> It is relatively easy to take a document and to print it to a PDF
> file. For example, I can read a text file and print it with
> LibreOffice.
>
> Now, I have a book scanned to a PDF file and I would like to extract
> the images to enter them in an image processor so as to check each
> entry in a list when I integrate the information into a database.
>
> I presume if I can print to a PDF file, then I can print to a JPG
> file as well.

AFAIK LibreOffice doesn't have a print to JPEG or whatever unless you actually have (if such a thing exists) a 3rd party application which provides a pseudo-printer.

However, on a quick test I found I could open a PDF with LibreOffice, copy an image & paste it into a paint program. This is with Linux but presumably should also work on W7.

However, it wouldn't be my preferred way of doing this on Linux as there are command line tools to extract images from PDFs. In fact I used them recently to split a PDF consisting of scans of maps & get a series of TIFFs of the original individual sheets.

Scanned books, e.g. from archive.org are often OCRd & you can also extract the text - complete with all the usual OCR artefacts.

--
Ian
The Hotmail address is my spam-bin. Real mail address is iang at austonley org uk
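Ian does not name the command line tools he used. One possibility, offered here purely as an assumption, is `pdfimages` from the poppler-utils package. The sketch below wraps it from Python; the function names, the `page` output prefix and the file names in the usage note are hypothetical, and the tool must be installed separately.

```python
import shutil
import subprocess
from pathlib import Path

# Output-format flags of poppler's pdfimages (assumed tool, not named in the thread).
FORMAT_FLAGS = {
    "png": "-png",    # write images as PNG
    "tiff": "-tiff",  # write images as TIFF
    "jpeg": "-j",     # write JPEG-encoded images as .jpg files
}


def build_pdfimages_command(pdf_path, out_prefix, fmt="png",
                            first_page=None, last_page=None):
    """Return the argument list for one pdfimages run."""
    if fmt not in FORMAT_FLAGS:
        raise ValueError(f"unsupported format: {fmt!r}")
    cmd = ["pdfimages", FORMAT_FLAGS[fmt]]
    if first_page is not None:
        cmd += ["-f", str(first_page)]  # first page to process
    if last_page is not None:
        cmd += ["-l", str(last_page)]   # last page to process
    cmd += [str(pdf_path), str(out_prefix)]
    return cmd


def extract_images(pdf_path, out_dir, **options):
    """Run pdfimages and return the image files it wrote."""
    if shutil.which("pdfimages") is None:
        raise RuntimeError("pdfimages not found; install poppler-utils")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    cmd = build_pdfimages_command(pdf_path, out_dir / "page", **options)
    subprocess.run(cmd, check=True)
    return sorted(out_dir.glob("page-*"))
```

For example, `extract_images("book.pdf", "sheets", fmt="tiff", first_page=3, last_page=5)` would write the images found on pages 3 to 5 into the `sheets` directory. For a scanned book this is usually one image per page, but that depends on how the PDF was produced.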
On 20 Nov at 1:13, Denis Beauregard <denis.b-at-francogene.com@fr.invalid> wrote:

> It is relatively easy to take a document and to print it to a PDF
> file. For example, I can read a text file and print it with
> LibreOffice.
>
> Now, I have a book scanned to a PDF file and I would like to extract
> the images to enter them in an image processor so as to check each
> entry in a list when I integrate the information into a database.
>
> I presume if I can print to a PDF file, then I can print to a JPG file
> as well.

Yes. But some programs do or allow this more easily than others.

But scanning images and putting them into a PDF file is not the same as integrating the information into a database. You first have to extract the information from the images. OCR, optical character recognition, can do this but the accuracy is not 100% and if the book is old with variable print, the accuracy will be even lower.

--
Tim Powys-Lybbe tim@powys.org
for a miscellany of bygones: http://powys.org/
On 11/19/2012 8:45 PM, Denis Beauregard wrote:
> On Tue, 20 Nov 2012 01:27:37 +0000, Tim Powys-Lybbe <tim@powys.org>
> wrote in soc.genealogy.computing:
>
>> On 20 Nov at 1:13, Denis Beauregard
>> <denis.b-at-francogene.com@fr.invalid> wrote:
>>
>>> It is relatively easy to take a document and to print it to a PDF
>>> file. For example, I can read a text file and print it with
>>> LibreOffice.
>>>
>>> Now, I have a book scanned to a PDF file and I would like to extract
>>> the images to enter them in an image processor so as to check each
>>> entry in a list when I integrate the information into a database.
>>>
>>> I presume if I can print to a PDF file, then I can print to a JPG file
>>> as well.
>>
>> Yes. But some programs do or allow this more easily than others.
>>
>> But scanning images and putting them into a PDF file is not the same as
>> integrating the information into a database. You first have to extract
>> the information from the images. OCR, optical character recognition, can
>> do this but the accuracy is not 100% and if the book is old with variable
>> print, the accuracy will be even lower.
>
> My immediate need was for a series of records where some of them are
> new, i.e. a list of burial records and for some of them, the church
> record is lost. So in a first step, I check out the record from my
> reference database, and then I add to my database the remaining
> records. I don't want to scan them.
>
> I already did this when I was using Acrobat 5 which is not compatible
> with the more recent PDF format so I was looking for something to
> do the same thing with Windows 7 and the new format.
>
>
> Denis
>

PDF-XChange Viewer allows you to export to an image file. You can export to all the common image file types. It is free and available to download at http://www.tracker-software.com/product/pdf-xchange-viewer.

--
Gene Young
Researching Young, Harer, Cox & Sallada
With Legacy Family Tree
http://myyoungs.atspace.com/index.htm
On Tue, 20 Nov 2012 01:27:37 +0000, Tim Powys-Lybbe <tim@powys.org> wrote in soc.genealogy.computing:

>On 20 Nov at 1:13, Denis Beauregard
><denis.b-at-francogene.com@fr.invalid> wrote:
>
>> It is relatively easy to take a document and to print it to a PDF
>> file. For example, I can read a text file and print it with
>> LibreOffice.
>>
>> Now, I have a book scanned to a PDF file and I would like to extract
>> the images to enter them in an image processor so as to check each
>> entry in a list when I integrate the information into a database.
>>
>> I presume if I can print to a PDF file, then I can print to a JPG file
>> as well.
>
>Yes. But some programs do or allow this more easily than others.
>
>But scanning images and putting them into a PDF file is not the same as
>integrating the information into a database. You first have to extract
>the information from the images. OCR, optical character recognition, can
>do this but the accuracy is not 100% and if the book is old with variable
>print, the accuracy will be even lower.

My immediate need was for a series of records where some of them are new, i.e. a list of burial records, and for some of them the church record is lost. So in a first step, I check out the record from my reference database, and then I add to my database the remaining records. I don't want to scan them.

I already did this when I was using Acrobat 5, which is not compatible with the more recent PDF format, so I was looking for something to do the same thing with Windows 7 and the new format.

Denis

--
Denis Beauregard - généalogiste émérite (FQSG)
Les Français d'Amérique du Nord - www.francogene.com/genealogie--quebec/
French in North America before 1722 - www.francogene.com/quebec--genealogy/
Sur cédérom à 1780 - On CD-ROM to 1780
It is relatively easy to take a document and to print it to a PDF file. For example, I can read a text file and print it with LibreOffice.

Now, I have a book scanned to a PDF file and I would like to extract the images to enter them in an image processor so as to check each entry in a list when I integrate the information into a database.

I presume if I can print to a PDF file, then I can print to a JPG file as well.

Denis

--
Denis Beauregard - généalogiste émérite (FQSG)
Les Français d'Amérique du Nord - www.francogene.com/genealogie--quebec/
French in North America before 1722 - www.francogene.com/quebec--genealogy/
Sur cédérom à 1780 - On CD-ROM to 1780
On 12 Nov at 23:40, Ian Goddard <goddai01@hotmail.co.uk> wrote: > I was recently shown an example page from a tithe book. As this is > possibly going to become a collaborative transcription project I've > been giving some thought to the sort of data structure needed to hold > the data. I've got some ideas in the back of my head but although it > looks simple it presents some interesting problems, especially if a > relational solution and capability of indexing personal names are > objectives. > > The pages are ruled vertically into a number of columns. These are: > > Landowner. To be thought of as a single entity but names are not > necessarily unique and a "Landowner" may not even be a single person > or even a natural person, e.g. one "Landowner" is a named person plus > the executors of another. So a Landowner is to be structured as a > list of sub-entities but even then the sub-entities aren't all of one > type. > > Occupier. Again this is to be thought of as a single entity but an > "Occupier" can consist of multiple non-unique names. In one instance > there is a name followed by "and another". Although there were none > in the example it wouldn't surprise me to see executors listed. The > structural problems are similar to those of "Landowner" > > Plan reference. A single number. For each such number there are > columns for: Name and Description which may describe one or more > buildings; e.g "2 Houses" or "House, outbuildings and garden"; or > land. State of cultivation - land only Quantities or statute measure > Greater Tithe Lesser Tithe > > Several plan items may be listed against an "Occupier" and apart from > the individual quantities & tithes for each item these are summed > against the "Occupier". > > Neither the pages nor any other items are numbered except the plan > entries. 
> > On the face of it this is a basic master/detail relationship extending > to two levels of detail: > > One Landowner to one or more Occupier One Occupier to one or more Plan > item > > However as both Landowner and Occupier are themselves potentially > identified by multiple names or descriptions there is no existing > content above Plan item which can be used as keys to hang this > together. > It would be possible to create artificial keys such as > auto-incrementing numbers or UUIDs (preferable for a collaborative > project) but, at least for a relational solution, this results in the > schema including two tables with no content other then the artificial > key. > Unique keys are always useful. I can only see one here, that of the plot of land. What I don't know is whether the beneficiary of the tithe is unique? Are Greater and Lesser tithes names for different beneficiaries? Or are there other categories of tithe recipient? Sounds to me that you need several sheets of paper, each charting a different entity. -- Tim Powys-Lybbe tim@powys.org for a miscellany of bygones: http://powys.org/
I was recently shown an example page from a tithe book. As this is possibly going to become a collaborative transcription project I've been giving some thought to the sort of data structure needed to hold the data. I've got some ideas in the back of my head but although it looks simple it presents some interesting problems, especially if a relational solution and capability of indexing personal names are objectives.

The pages are ruled vertically into a number of columns. These are:

Landowner. To be thought of as a single entity but names are not necessarily unique and a "Landowner" may not even be a single person or even a natural person, e.g. one "Landowner" is a named person plus the executors of another. So a Landowner is to be structured as a list of sub-entities but even then the sub-entities aren't all of one type.

Occupier. Again this is to be thought of as a single entity but an "Occupier" can consist of multiple non-unique names. In one instance there is a name followed by "and another". Although there were none in the example it wouldn't surprise me to see executors listed. The structural problems are similar to those of "Landowner".

Plan reference. A single number. For each such number there are columns for:
  Name and Description, which may describe one or more buildings, e.g. "2 Houses" or "House, outbuildings and garden", or land
  State of cultivation - land only
  Quantities or statute measure
  Greater Tithe
  Lesser Tithe

Several plan items may be listed against an "Occupier" and apart from the individual quantities & tithes for each item these are summed against the "Occupier".

Neither the pages nor any other items are numbered except the plan entries.

On the face of it this is a basic master/detail relationship extending to two levels of detail:

  One Landowner to one or more Occupier
  One Occupier to one or more Plan item

However as both Landowner and Occupier are themselves potentially identified by multiple names or descriptions there is no existing content above Plan item which can be used as keys to hang this together. It would be possible to create artificial keys such as auto-incrementing numbers or UUIDs (preferable for a collaborative project) but, at least for a relational solution, this results in the schema including two tables with no content other than the artificial key.

--
Ian
The Hotmail address is my spam-bin. Real mail address is iang at austonley org uk
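The structure described above can be sketched with SQLite to make the trade-off concrete. This is only an illustration under stated assumptions, not a schema proposed in the thread: every table and column name is invented, the sample names are made up, plan references are assumed to be unique within one book, and the single `party_name` table is just one way of attaching several differently typed names to a Landowner or Occupier. As the post predicts, the `landowner` table ends up holding nothing but its artificial key.

```python
import sqlite3
import uuid

# Hypothetical schema: all names here are illustrative assumptions.
SCHEMA = """
CREATE TABLE landowner (id TEXT PRIMARY KEY);  -- artificial key only
CREATE TABLE occupier (
    id TEXT PRIMARY KEY,
    landowner_id TEXT NOT NULL REFERENCES landowner(id)
);
CREATE TABLE party_name (                      -- one row per name as written
    id TEXT PRIMARY KEY,
    holder_kind TEXT NOT NULL CHECK (holder_kind IN ('landowner', 'occupier')),
    holder_id TEXT NOT NULL,
    position INTEGER NOT NULL,                 -- order on the page
    name_as_written TEXT NOT NULL,
    role TEXT                                  -- e.g. 'executors of'
);
CREATE TABLE plan_item (
    plan_ref INTEGER PRIMARY KEY,              -- assumed unique per book
    occupier_id TEXT NOT NULL REFERENCES occupier(id),
    description TEXT,
    cultivation TEXT
);
CREATE INDEX party_name_idx ON party_name(name_as_written);
"""


def new_id():
    """UUIDs let collaborators create rows without a central counter."""
    return str(uuid.uuid4())


def add_names(conn, holder_kind, holder_id, names):
    for position, (name, role) in enumerate(names, start=1):
        conn.execute(
            "INSERT INTO party_name VALUES (?, ?, ?, ?, ?, ?)",
            (new_id(), holder_kind, holder_id, position, name, role),
        )


def add_landowner(conn, names):
    landowner_id = new_id()
    conn.execute("INSERT INTO landowner VALUES (?)", (landowner_id,))
    add_names(conn, "landowner", landowner_id, names)
    return landowner_id


def add_occupier(conn, landowner_id, names):
    occupier_id = new_id()
    conn.execute("INSERT INTO occupier VALUES (?, ?)", (occupier_id, landowner_id))
    add_names(conn, "occupier", occupier_id, names)
    return occupier_id


def add_plan_item(conn, occupier_id, plan_ref, description, cultivation=None):
    conn.execute(
        "INSERT INTO plan_item VALUES (?, ?, ?, ?)",
        (plan_ref, occupier_id, description, cultivation),
    )


def plan_refs_for_name(conn, name):
    """Name index: plan items linked to a name, as occupier or as landowner."""
    rows = conn.execute(
        """
        SELECT p.plan_ref FROM party_name n
        JOIN occupier o ON n.holder_kind = 'occupier' AND n.holder_id = o.id
        JOIN plan_item p ON p.occupier_id = o.id
        WHERE n.name_as_written = ?
        UNION
        SELECT p.plan_ref FROM party_name n
        JOIN occupier o ON n.holder_kind = 'landowner' AND n.holder_id = o.landowner_id
        JOIN plan_item p ON p.occupier_id = o.id
        WHERE n.name_as_written = ?
        ORDER BY 1
        """,
        (name, name),
    ).fetchall()
    return [row[0] for row in rows]


def build_demo():
    """Load one made-up page entry into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    owner = add_landowner(conn, [("John Smith", None), ("Thomas Brown", "executors of")])
    occupier = add_occupier(conn, owner, [("William Green", None), ("and another", "unnamed")])
    add_plan_item(conn, occupier, 101, "2 Houses")
    add_plan_item(conn, occupier, 102, "House, outbuildings and garden")
    return conn
```

Calling `plan_refs_for_name(build_demo(), "Thomas Brown")` finds both plan items through the landowner side, even though that name belongs to an estate and not to a living owner. Keeping the name rows separate from the key-only tables is what makes personal names indexable.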
On Fri, 09 Nov 2012 11:39:01 +0000, Ian Goddard <goddai01@hotmail.co.uk> wrote: >Steve Hayes wrote: >> On Thu, 8 Nov 2012 11:38:47 -0800 (PST), pblair<pblair@pcug.org.au> wrote: >> >>> Hi Steve >>> >>> As I noted earlier, the call from the FTM menu seems to be hard wired to one particular print driver-the one that came with the original package. And that's the one giving problems. >> >> What happens if you buy a new printer? >> >> > >As far as I can make out this is a pseudo-printer driver to generate >PDFs. I suppose one alternative might be to try an alternative package >& rename the executable to that of the original, always assuming Win8 >will allow the user to do that. Is there any reason why one could not use this: http://fineprint.com/ or this: http://pdfpro10.com/o/creator/ or this: http://www.pdfforge.org/pdfcreator and I'm sure there must be lots of others. -- Steve Hayes from Tshwane, South Africa Blog: http://khanya.wordpress.com E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
On Fri, 09 Nov 2012 11:39:01 +0000, Ian Goddard <goddai01@hotmail.co.uk> wrote: >Steve Hayes wrote: >> On Thu, 8 Nov 2012 11:38:47 -0800 (PST), pblair<pblair@pcug.org.au> wrote: >> >>> Hi Steve >>> >>> As I noted earlier, the call from the FTM menu seems to be hard wired to one particular print driver-the one that came with the original package. And that's the one giving problems. >> >> What happens if you buy a new printer? >> >> > >As far as I can make out this is a pseudo-printer driver to generate >PDFs. I suppose one alternative might be to try an alternative package >& rename the executable to that of the original, always assuming Win8 >will allow the user to do that. I have a Samsung printer, which seems to work with most Windows programs. It's a bit of a mission to get it to work with DOS programs, though. It replaced a Konica-Minolta printer, but when I changed I did not have to get an alternative package and rename any executables, so I'm finding it difficult to understand what the problem is. I used to have FTM 2006 -- it came on one of those discs that come with magazines, and I tried it out, didn't like it and removed it. Perhaps I could find the original disc and reinstall it, and see if it works with one of the many PDF printers available. -- Steve Hayes from Tshwane, South Africa Blog: http://khanya.wordpress.com E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
Steve Hayes wrote: > On Thu, 8 Nov 2012 11:38:47 -0800 (PST), pblair<pblair@pcug.org.au> wrote: > >> Hi Steve >> >> As I noted earlier, the call from the FTM menu seems to be hard wired to one particular print driver-the one that came with the original package. And that's the one giving problems. > > What happens if you buy a new printer? > > As far as I can make out this is a pseudo-printer driver to generate PDFs. I suppose one alternative might be to try an alternative package & rename the executable to that of the original, always assuming Win8 will allow the user to do that. -- Ian The Hotmail address is my spam-bin. Real mail address is iang at austonley org uk
On Thu, 8 Nov 2012 11:38:47 -0800 (PST), pblair <pblair@pcug.org.au> wrote: >Hi Steve > >As I noted earlier, the call from the FTM menu seems to be hard wired to one particular print driver-the one that came with the original package. And that's the one giving problems. What happens if you buy a new printer? -- Steve Hayes from Tshwane, South Africa Blog: http://khanya.wordpress.com E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
Hi Steve As I noted earlier, the call from the FTM menu seems to be hard wired to one particular print driver-the one that came with the original package. And that's the one giving problems. I've tried later versions from the company that provided the original one, plus a few others that are available, as you suggest. But there has been no joy. I'll manage without it! :-( Paul
On Wed, 7 Nov 2012 19:59:27 -0800 (PST), pblair <pblair@pcug.org.au> wrote:

>Couldn't agree more.
>
>I have tried FTM 2008, but don't trust it. Legacy badly needs a rebuild, and Access is not always as reliable as I would hope for.
>
>I've managed to defeat the FTM 2005 popup that tells me the driver can't be installed. So I'll roll on without PDF ability, but the rest is fine.

Can't you use something like PDFCreator?

I use PDFfactory, which has a free and paid-for version, and I like it because it lets you combine numerous documents. I just used it this morning to send someone two pedigree charts and five family group sheets from Legacy, all in one document. And if you leave it open, you can add stuff from word processors and other programs as well.

--
Steve Hayes from Tshwane, South Africa
Blog: http://khanya.wordpress.com
E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
pblair wrote: > And I'll bet you threw the wrapper away.. :-) > > Not too many software choices left....none that I would consider. One of the reasons I cling to PAF is that it's familiar. I already know what its many quirks are. I started using it as a DOS program in v 2.0; I've stuck to v5 which is current. It doesn't do a lot I'd like it to do, and it does a lot I don't need it to do, but ... the learning curve is a flat line, and in my life right now, that's a priority. (g) Better the devil you know, and all that. Cheryl
Couldn't agree more. I have tried FTM 2008, but don't trust it. Legacy badly needs a rebuild, and Access is not always as reliable as I would hope for. I've managed to defeat the FTM 2005 popup that tells me the driver can't be installed. So I'll roll on without PDF ability, but the rest is fine. Paul
pblair wrote: > Looking at http://en.wikipedia.org/wiki/Ancestry.com: > I've got 2005 > I think v16 (2006) was the last to have All-In-One (2008 was a major rewrite, I think), but I could have that wrong. > So................2006 for you? > (It's all too complicated... :-( ) Lord no, not 2006! 2000 maybe? 2002 maybe? It usually makes me break out in hives when I use it, but I had correspondents who couldn't manage to create a GED and sent me their whole ftm database, so I had to have it around... Paul Burchfield offered me my money back, but I wouldn't take it. Figured I'd have to surrender my gritching rights if I did. (g) Cheryl
And I'll bet you threw the wrapper away.. :-) Not too many software choices left....none that I would consider. Paul
(G) Yours is newer than mine. I /think/ mine's the v before the last one that gave the all-in-one report. (g) Cheryl pblair wrote: > Hi Cheryl > > I'm using 2005, not the new version. > > I wouldn't mind upgrading, but the new(er) versions of FTM seem to have problems.
Looking at http://en.wikipedia.org/wiki/Ancestry.com: I've got 2005 I think v16 (2006) was the last to have All-In-One (2008 was a major rewrite, I think), but I could have that wrong. So................2006 for you? (It's all too complicated... :-( )