I need to extract names (including uncommon names) from blocks of text using Perl. I’ve looked into this module for extracting names, but it only has the top 1000 popular names and surnames in the US dating back to 1990; I need something a bit more comprehensive.
I’ve considered using the Social Security Index to build a database for comparison, but that seems tedious and processing-intensive. Is there another way to pull names from text using Perl?
Example of text to parse:
LADNIER
Louis Anthony Ladnier, [Louie] age 48, of Mobile, Alabama died at home Friday, November 16, 2012.
Louie was born January 9, 1964 in Mobile, Alabama. He was the son of John E. Ladnier, Sr. and Gloria Bosarge Ladnier. He was a graduate of McGill-Toolen High School and attended University of South Alabama. He was employed up until his medical retirement as Communications Supervisor with the Bayou La Batre Police Department.
He is preceded in death by his father, John. Survived by his mother, Gloria, nephews, Dominic Ladnier and Christian Rubio, whom he loved and help raise as his own sons, sisters, Marj Ladnier and Morgan Gordy [Julian], and brother Eddie Ladnier [Cindy], and nephews, Jamie, Joey, Eddie, Will, Ben and nieces, Anna and Elisabeth.
Memorial service will be held at St. Dominic’s Catholic Church in Mobile on Wednesday at 1pm.
Serenity Funeral Home is in charge of arrangements.
In lieu of flowers, memorials may be sent to St. Dominic School, 4160 Burma Road Mobile, AL 36693, education fund for Christian Rubio and McGill-Toolen High School, 1501 Old Shell Road Mobile, AL 36604, education Fund for Dominic Ladnier.
The family is grateful for all the prayers and support during this time. Louie was a rock and a joy to us all.
There is no surefire way to do this, given the nature of the English language. You either need lists to compare against (exactly or fuzzily), or you will have to accept a significant loss of accuracy.
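As a rough illustration of the list-based route, here is a minimal sketch that collects runs of capitalized words as candidates and keeps a run only if one of its words is within a small edit distance of a reference name. The `@known_names` list, the helper names (`edit_distance`, `looks_like_name`, `extract_names`) and the threshold of 1 are all assumptions made up for this example; a real reference list would need to be far larger, and the regex deliberately ignores initials, suffixes such as "Sr." and names like "McGill".

```perl
use strict;
use warnings;

# Hypothetical reference list; in practice this would be loaded from a
# much larger source of given names and surnames.
my @known_names = qw(Louis Anthony Ladnier John Gloria Dominic Christian Eddie);

# Case-insensitive Levenshtein edit distance, computed row by row.
sub edit_distance {
    my ($s, $t) = map { lc } @_;
    my @prev = (0 .. length $t);
    for my $i (1 .. length $s) {
        my @cur = ($i);
        for my $j (1 .. length $t) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            my $best = $prev[$j - 1] + $cost;                      # substitution
            $best = $prev[$j] + 1    if $prev[$j] + 1 < $best;     # deletion
            $best = $cur[$j - 1] + 1 if $cur[$j - 1] + 1 < $best;  # insertion
            push @cur, $best;
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# True if $word is within $max_dist edits of any name in the reference list.
sub looks_like_name {
    my ($word, $max_dist) = @_;
    for my $name (@known_names) {
        return 1 if edit_distance($word, $name) <= $max_dist;
    }
    return 0;
}

# Find runs of capitalized words, then keep a run only if at least one of
# its words fuzzy-matches the reference list.
sub extract_names {
    my ($text, $max_dist) = @_;
    my @found;
    while ($text =~ /\b([A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)\b/g) {
        my $candidate = $1;
        my @words = split /\s+/, $candidate;
        push @found, $candidate
            if grep { looks_like_name($_, $max_dist) } @words;
    }
    return @found;
}

my $sample = "Louis Anthony Ladnier, age 48, of Mobile, Alabama died at home Friday.";
print "$_\n" for extract_names($sample, 1);
```

With this toy list the sample sentence yields only "Louis Anthony Ladnier"; "Mobile", "Alabama" and "Friday" are capitalized but match nothing in the list. The fuzzy threshold is what lets a misspelling such as "Ladner" still match "Ladnier", at the cost of more false positives as the list grows.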