Note: The Rootsweb Mailing Lists will be shut down on April 6, 2023.
RootsWeb.com Mailing Lists
Total: 4/4
    1. Re: [SFHG] British Newspaper Archive
    2. Phil Vaughan
    3. The mangled text that you referred to, Mike, is the result of running the photo or scanned copy of the original document through an Optical Character Recognition ("OCR") program. When the original text is small or faded, or the characters are too close together, OCR errors occur. But OCR programs have improved enormously in recent years, making them fairly usable for this sort of task.

http://newspaperarchive.com is a similar site, although I have to say that its search engine leaves much to be desired at the moment.

Phil Vaughan
Canada

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Mike Snatt
Sent: October-11-12 4:39 PM
To: Russell Tuffery; [email protected]
Subject: Re: [SFHG] British Newspaper Archive

Russell -

I had a look at this archive, and was astonished to hit gold at the first attempt - the Snatt family has a highwayman in its Tree! I discovered that merely from searching on our surname and reading the extracts which appeared - I didn't need to buy any full articles.

I think the extracts are intended to be the headlines, which didn't exist in old newspapers, plus the first line of the article, but the scanner often picks up the line just before the beginning of the article instead. So the heading for the entry about the death of Spencer Snatt is an advertisement for cocoa!

Another drawback is that the extracts are computer-generated (presumably there's a technical name for it), and the computer seems to make things up if it can't figure out the actual words. For example, I got a lot of items covering a mine disaster in Scotland, where it had transcribed 'shaft' as 'snatt'.

It looks as if the project is in its infancy at present - only a few newspapers have been transcribed.

So, to answer your question - it's undoubtedly useful, but whether it gives value for money will be a matter of personal opinion.

Mike Snatt

----- Original Message -----
From: "Russell Tuffery" <[email protected]>
To: <[email protected]>
Sent: Sunday, October 07, 2012 1:56 AM
Subject: [SFHG] British Newspaper Archive

http://www.britishnewspaperarchive.co.uk

Interested to know how useful members find this archive.

I have used the free paperspast.natlib.govt.nz and trove.nla.gov.au and have had a much greater success rate.

Russell Tuffery
Auckland New Zealand

-------------------------------
To unsubscribe from the list, please send an email to [email protected] with the word 'unsubscribe' without the quotes in the subject and the body of the message

    10/11/2012 11:06:43
    1. Re: [SFHG] British Newspaper Archive
    2. Mike Snatt
    3. Thanks Phil.

The degree of manglement of the text is severe! I would have thought even a computer could do better. It's not just the 18th Century papers which suffer, but also later ones, where legibility is greatly improved. However, this is such a useful resource that anything is better than nothing, and certainly cheaper than a trip to Colindale, even for me in Sussex.

Mike S

    10/11/2012 08:47:29
    1. Re: [SFHG] British Newspaper Archive
    2. Phil Vaughan
    3. Our eyes and brains are much better at character recognition (thank goodness!) than OCR software. This is even more true when the original text is in a newspaper, where space on the page is valuable and the typesetters - especially in the past - pushed the letters as close together as possible in order to cram more text onto their page. The inked letters often touched each other, making two consecutive characters appear to be one ... and, of course, portions of a letter may have faded with age or even failed to print. Hence, OCR programs find older newspaper text particularly manglable. (There you go, Tony!)

Phil
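A minimal sketch of why a misreading such as 'shaft' for 'snatt' is so easy to produce: the two words differ in only two character positions. The function names below are hypothetical, and edit distance is used purely as an illustration; nothing in this thread says how the archive's OCR or search engine actually works.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def differing_positions(a: str, b: str) -> list[tuple[int, str, str]]:
    """List (index, char_in_a, char_in_b) wherever two words disagree.
    Compares only up to the length of the shorter word."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    # The printed word and the surname it was transcribed as.
    print(edit_distance("shaft", "snatt"))
    print(differing_positions("shaft", "snatt"))
```

Only the 'h' and the 'f' have to be misread for the surname to appear, which is plausible when the inked letters touch or have faded.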

    10/12/2012 01:38:59
    1. Re: [SFHG] British Newspaper Archive
    2. Cordelia Hull
    3. I tried searching this database for one of my ancestral names, which is COLLISON. I found an amazing number of railway COLLISONS and barge COLLISONS and train COLLISONS, lots of injuries and damage. But not one ancestor for me!!! :-(

Cordelia
14526

    10/13/2012 03:53:57