Google & Partners are Digitizing Millions of Old Books

Five of the world's largest libraries have joined Google to digitize millions of books and make every sentence searchable. Nothing in today's announcement mentions genealogy books but with millions of out-of-print books being digitized, one has to believe that at least a handful of them will be genealogies or local histories.

The project involves libraries at Harvard and Stanford Universities, the University of Michigan at Ann Arbor, and the University of Oxford, as well as the New York Public Library. It could soon turn Google into the single largest holder of digitized published material. In effect, it will become the world's largest digital library and one of the world's largest libraries of any kind. It will also provide researchers and students with an unprecedented tool for finding information.

The company will begin by scanning works that are in the public domain, and the full contents of those books will be accessible online through the popular Google search engine. But the company also plans to scan copyrighted books in some of the libraries. The search engine will not return the full texts of those volumes, but will instead provide up to three short excerpts, each consisting of only a few lines of text in which a search term appears. Google officials and librarians hope the excerpts will be sufficient to let researchers determine whether they want to check out or purchase the book. Google will include links to online booksellers and local library catalogues along with search results.

The number of volumes that could be scanned is interesting to contemplate:

Harvard University: 15 million volumes
New York Public Library: 20 million
Stanford: more than 7.6 million
University of Michigan: 7.8 million
Oxford: more than 6.5 million books

Harvard, Stanford, and the New York Public Library have agreed only to pilot projects with the company.
Harvard University, for example, has agreed to let Google scan only 40,000 books during the pilot phase of the project. The books will be selected randomly from the five million volumes in the Harvard Depository, an off-site storage facility for seldom-requested books. During the pilot phase of the project, the New York Public Library has agreed to let Google scan more than 10,000 but fewer than 100,000 public domain books. Oxford will allow Google to scan only books published before 1900, while officials at the University of Michigan have agreed to allow all of their books to be scanned. All of the projects are expected to take years to complete.

Susan Wojcicki, director of product management for Google, said that the Google Print project would lead to an increase in book sales because it would show readers what the volumes contain. "For publishers, we believe that this will be beneficial," she said.

AND

Library and Archives Canada is Scanning Millions of Pages of Documents

The following is an excerpt from an interesting article about digitizing old documents in Canada:

Library and Archives Canada, which combines the former National Library of Canada and National Archives of Canada, has been especially active, scanning millions of pages of documents a year. It has now put all of the publications, including pamphlets and books, printed in Canada in the 18th and 19th century on-line for the public to access, said Ian Wilson, librarian and archivist of Canada.

"We're building this systematically and we're looking right now at the feasibility of other print material for the 20th century," he said.

But even if the archive digitizes several million pages a year over 10 years, it will still have only less than half of 1 per cent of the national archives on-line, Wilson added.

You can read the entire article on the Globe and Mail web site at http://www.theglobeandmail.com/servlet/ArticleNews/TPStory/LAC/20041215/GOOGLE15/TPEntertainment/TopStories