This dataset merges genealogies from out-of-copyright books into a single collection. It currently contains approximately 15,000 names. The surname index is available at http://orcuttfamily.topcities.com/newengland/surnames.html