Wondering
I'm struggling to learn Perl, with some degree of success. I have a
question that's a bit more advanced than I am, but I hope someone can
help (thanks in advance to all who read this and bigger thanks to
responders).
I'm trying to match name and address records in a large (~300,000
record) database with potential new records to avoid duplicates. Anyone
who has tried this knows that there are problems with exact matching,
especially if no convention has been followed for entering data.
(Consider all the possible variations of "avenue" - "avenue", "av",
"ave", etc., and when you consider drive, boulevard, etc. and all their
possible abbreviations, you begin to get the picture). So, I want to be
able to extract just the numeric characters in a string so I can do
the matching on those (it's fuzzy, but with other fields being
considered, too, we can get a fairly high matching rate). Anyone know
how to extract just the numeric characters?
I'll also accept any other ideas for doing the match.
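
To make the request concrete, here is a minimal sketch of the kind of digit extraction described above. The helper names digits_only and match_key, the sample addresses, and the choice of a ZIP field as the second field are all made up for illustration; this is one possible approach using Perl's tr operator, not a tested solution for the real database.

```perl
use strict;
use warnings;

# Hypothetical helper: return only the characters 0-9 from a string.
sub digits_only {
    my ($text) = @_;
    # Copy the input, then delete (d) every character that is NOT (c) in 0-9.
    (my $digits = $text) =~ tr/0-9//cd;
    return $digits;
}

# Hypothetical match key: digits from the address joined with digits from
# another field (a ZIP code is assumed here purely as an example).
sub match_key {
    my ($address, $zip) = @_;
    return join '|', digits_only($address), digits_only($zip);
}

print digits_only("123 N. Main Ave., Apt 4B"), "\n";   # prints 1234
```

With this sketch, "123 Main Avenue" and "123 Main Av." with the same ZIP produce the same key. One caveat: running the digits together loses the boundaries between numbers, so "12 3rd St" and "123 Main St" would also collide. Capturing the digit runs separately, for example with my @runs = $text =~ /([0-9]+)/g, might be a safer basis for the comparison.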