Regexp to split name?

Discussion in 'Ruby' started by Alex MacCaw, Jun 23, 2007.

  1. Alex MacCaw

    Alex MacCaw Guest

    Does anyone have an example of splitting a name into first and last
    names? Or is it just a case of doing string.split(' ')?

    --
    Posted via http://www.ruby-forum.com/.
     
    Alex MacCaw, Jun 23, 2007
    #1

  2. Alex MacCaw

    darren kirby Guest

    quoth the Alex MacCaw:
    > Does anyone have an example of splitting a name into first and last
    > names? Or is just a case of doing string.split(' ')?


    I'd say a regexp is overkill here.

    irb(main):001:0> name = "Alex MacCaw"
    => "Alex MacCaw"
    irb(main):002:0> first, last = name.split
    => ["Alex", "MacCaw"]
    irb(main):003:0> first
    => "Alex"
    irb(main):004:0> last
    => "MacCaw"

    Note that you will have to do more work to accommodate middle names and
    titles, e.g. Mr, Mrs, Dr, etc.
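
    As a rough illustration of that extra work, a splat in the middle of the
    assignment can gather any middle names. This is only a sketch: the method
    name split_name is made up for the example, and it assumes that whitespace
    separates the parts and that the first and last tokens are the given name
    and the surname.

```ruby
# Hypothetical helper: split on whitespace, treating the first token as the
# given name, the last token as the surname, and anything between as middles.
# Assumption: every name fits that shape, which real names often do not.
def split_name(name)
  first, *middles, last = name.split
  { first: first, middles: middles, last: last }
end

two_part  = split_name("Alex MacCaw")
four_part = split_name("John Joe Peter Smith")

puts two_part.inspect
puts four_part.inspect
```

    Titles and multi-word surnames are not handled by this at all.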

    -d
    --
    darren kirby :: Part of the problem since 1976 :: http://badcomputer.org
    "...the number of UNIX installations has grown to 10, with more expected..."
    - Dennis Ritchie and Ken Thompson, June 1972
     
    darren kirby, Jun 23, 2007
    #2

  3. Alex MacCaw

    Guest

    Hi --

    On Sun, 24 Jun 2007, darren kirby wrote:

    > quoth the Alex MacCaw:
    >> Does anyone have an example of splitting a name into first and last
    >> names? Or is just a case of doing string.split(' ')?

    >
    > I'd say a regexp is overkill here.
    >
    > irb(main):001:0> name = "Alex MacCaw"
    > => "Alex MacCaw"
    > irb(main):002:0> first, last = name.split
    > => ["Alex", "MacCaw"]
    > irb(main):003:0> first
    > => "Alex"
    > irb(main):004:0> last
    > => "MacCaw"
    >
    > Note that you will have to do more work to accommodate middle names and
    > titles, ie: Mr, Mrs, Dr etc...


    And also last names with spaces in them (von Trapp, Vaughn Williams,
    etc.).


    David

    --
    * Books:
    RAILS ROUTING (new! http://www.awprofessional.com/title/0321509242)
    RUBY FOR RAILS (http://www.manning.com/black)
    * Ruby/Rails training
    & consulting: Ruby Power and Light, LLC (http://www.rubypal.com)
     
    , Jun 23, 2007
    #3
  4. Alex MacCaw

    Alex Young Guest

    wrote:
    > Hi --
    >
    > On Sun, 24 Jun 2007, darren kirby wrote:
    >
    >> quoth the Alex MacCaw:
    >>> Does anyone have an example of splitting a name into first and last
    >>> names? Or is just a case of doing string.split(' ')?

    >>
    >> I'd say a regexp is overkill here.
    >>
    >> irb(main):001:0> name = "Alex MacCaw"
    >> => "Alex MacCaw"
    >> irb(main):002:0> first, last = name.split
    >> => ["Alex", "MacCaw"]
    >> irb(main):003:0> first
    >> => "Alex"
    >> irb(main):004:0> last
    >> => "MacCaw"
    >>
    >> Note that you will have to do more work to accommodate middle names and
    >> titles, ie: Mr, Mrs, Dr etc...

    >
    > And also last names with spaces in them (von Trapp, Vaughn Williams,
    > etc.).
    >

    And titles with spaces in them (The Honourable, His Excellency, etc...).

    --
    Alex
     
    Alex Young, Jun 25, 2007
    #4
  5. On 6/25/07, Alex Young <> wrote:
    > wrote:
    > > Hi --
    > >
    > > On Sun, 24 Jun 2007, darren kirby wrote:
    > >
    > >> quoth the Alex MacCaw:
    > >>> Does anyone have an example of splitting a name into first and last
    > >>> names? Or is just a case of doing string.split(' ')?
    > >>
    > >> I'd say a regexp is overkill here.
    > >>
    > >> irb(main):001:0> name = "Alex MacCaw"
    > >> => "Alex MacCaw"
    > >> irb(main):002:0> first, last = name.split
    > >> => ["Alex", "MacCaw"]
    > >> irb(main):003:0> first
    > >> => "Alex"
    > >> irb(main):004:0> last
    > >> => "MacCaw"
    > >>
    > >> Note that you will have to do more work to accommodate middle names and
    > >> titles, ie: Mr, Mrs, Dr etc...

    > >
    > > And also last names with spaces in them (von Trapp, Vaughn Williams,
    > > etc.).
    > >

    > And titles with spaces in them (The Honourable, His Excellency, etc...).
    >


    And international names (though the US seems to have a broad
    assortment of them already)
     
    Michael Fellinger, Jun 25, 2007
    #5
  6. Alex MacCaw

    Alex Young Guest

    Michael Fellinger wrote:
    > On 6/25/07, Alex Young <> wrote:
    >> wrote:
    >> > Hi --
    >> >
    >> > On Sun, 24 Jun 2007, darren kirby wrote:
    >> >
    >> >> quoth the Alex MacCaw:
    >> >>> Does anyone have an example of splitting a name into first and last
    >> >>> names? Or is just a case of doing string.split(' ')?
    >> >>
    >> >> I'd say a regexp is overkill here.
    >> >>
    >> >> irb(main):001:0> name = "Alex MacCaw"
    >> >> => "Alex MacCaw"
    >> >> irb(main):002:0> first, last = name.split
    >> >> => ["Alex", "MacCaw"]
    >> >> irb(main):003:0> first
    >> >> => "Alex"
    >> >> irb(main):004:0> last
    >> >> => "MacCaw"
    >> >>
    >> >> Note that you will have to do more work to accommodate middle names and
    >> >> titles, ie: Mr, Mrs, Dr etc...
    >> >
    >> > And also last names with spaces in them (von Trapp, Vaughn Williams,
    >> > etc.).
    >> >

    >> And titles with spaces in them (The Honourable, His Excellency, etc...).
    >>

    >
    > And international names (though the US seems to have a broad
    > assortment of them already)
    >

    Can open. Worms everywhere. :)

    --
    Alex
     
    Alex Young, Jun 25, 2007
    #6
  7. On 23/06/07, darren kirby <> wrote:
    > quoth the Alex MacCaw:
    > > Does anyone have an example of splitting a name into first and last
    > > names? Or is just a case of doing string.split(' ')?

    >
    > I'd say a regexp is overkill here.
    >
    > irb(main):001:0> name = "Alex MacCaw"
    > => "Alex MacCaw"
    > irb(main):002:0> first, last = name.split
    > => ["Alex", "MacCaw"]
    > irb(main):003:0> first
    > => "Alex"
    > irb(main):004:0> last
    > => "MacCaw"
    >
    > Note that you will have to do more work to accommodate middle names and
    > titles, ie: Mr, Mrs, Dr etc...
    >



    name = "Mr John Joe Peter Smith"
    TITLES = ["Mr", "Mrs", "Ms", "Dr"]
    a = name.split
    last = a.pop
    title = a.shift if TITLES.include? a.first
    first = a.shift
    middles = a

    title #=> "Mr"
    first #=> "John"
    middles #=> ["Joe", "Peter"]
    last #=> "Smith"
     
    Dan Stevens (IAmAI), Jun 25, 2007
    #7
  8. Alex MacCaw

    Guest

    Hi --

    On Mon, 25 Jun 2007, Dan Stevens (IAmAI) wrote:

    > name = "Mr John Joe Peter Smith"
    > TITLES = ["Mr", "Mrs", "Ms", "Dr"]
    > a = name.split
    > last = a.pop
    > title = a.shift if TITLES.include? a.first


    Have mercy on us Yanks and allow for a period :)

    > first = a.shift
    > middles = a
    >
    > title #=> "Mr"
    > first #=> "John"
    > middles #=> ["Joe", "Peter"]
    > last #=> "Smith"


    However:

    name = "Mr Andrew Lloyd Webber"

    # etc.

    title #=> "Mr"
    first #=> "Andrew"
    middles #=> ["Lloyd"] (wrong)
    last #=> "Webber" (wrong)
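
    One way to allow for the period is to ignore a trailing dot when looking
    the title up. The sketch below reuses the TITLES list from the quoted
    post; the method name parse_name is made up for the example, and the
    multi-word surname problem is deliberately left unsolved.

```ruby
TITLES = ["Mr", "Mrs", "Ms", "Dr"].freeze

# Hypothetical helper following the quoted approach, with one change:
# a trailing period on the first token is ignored when matching titles.
def parse_name(name)
  parts = name.split
  last  = parts.pop
  title = parts.shift if parts.first && TITLES.include?(parts.first.chomp("."))
  first = parts.shift
  { title: title, first: first, middles: parts, last: last }
end

with_period = parse_name("Dr. John Joe Peter Smith")
webber      = parse_name("Mr Andrew Lloyd Webber")

puts with_period.inspect
puts webber.inspect
```

    The second call still splits "Lloyd Webber" into a middle name and a
    surname, so the period fix does nothing for that case.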


    David

     
    , Jun 25, 2007
    #8
  9. Alex MacCaw

    Alex Young Guest

    wrote:
    > Hi --
    >
    > On Mon, 25 Jun 2007, Dan Stevens (IAmAI) wrote:
    >
    >> name = "Mr John Joe Peter Smith"
    >> TITLES = ["Mr", "Mrs", "Ms", "Dr"]
    >> a = name.split
    >> last = a.pop
    >> title = a.shift if TITLES.include? a.first

    >
    > Have mercy on us Yanks and allow for a period :)
    >
    >> first = a.shift
    >> middles = a
    >>
    >> title #=> "Mr"
    >> first #=> "John"
    >> middles #=> ["Joe", "Peter"]
    >> last #=> "Smith"

    >
    > However:
    >
    > name = "Mr Andrew Lloyd Webber"
    >
    > # etc.
    >
    > title #=> "Mr"
    > first #=> "Andrew"
    > middles #=> ["Lloyd"] (wrong)
    > last #=> "Webber" (wrong)
    >


    name = "The Honourable Lord Andrew, the Baron Lloyd-Webber of
    Sydmonton", you mean? It's hard to come up with a trickier example.
    Names are just *hard* - the only reliable way of handling them that I've
    found is to let users control it themselves...
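
    A minimal sketch of what letting users control it could look like: store
    each part exactly as it was entered in its own field and never derive one
    part from another. The PersonName struct and its field names are
    hypothetical, and keyword_init assumes Ruby 2.5 or later.

```ruby
# Hypothetical record: each part is whatever the user typed into its own
# field; nothing is derived by parsing a combined string.
PersonName = Struct.new(:title, :given, :family, :display, keyword_init: true) do
  # Use the user-supplied display form if there is one; otherwise join the
  # parts that were actually filled in.
  def to_s
    display || [title, given, family].compact.reject(&:empty?).join(" ")
  end
end

webber = PersonName.new(given: "Andrew", family: "Lloyd Webber")
prince = PersonName.new(display: "Prince")

puts webber
puts webber.family
puts prince
```

    Because the surname arrives as its own field, "Lloyd Webber" is stored
    whole and no splitting rule has to guess where it starts.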

    --
    Alex
     
    Alex Young, Jun 25, 2007
    #9
  10. > Names are just *hard* - the only reliable way of handling them that I've
    > found is to let users control it themselves...


    Agreed. My example makes very simple assumptions that I'd imagine
    apply to the vast majority of names. However, in many computer
    problems there are obscure exceptions that either break the program or
    break things for the user.

     
    Dan Stevens (IAmAI), Jun 25, 2007
    #10
  11. Alex MacCaw

    Alex LeDonne Guest

    On 6/25/07, Dan Stevens (IAmAI) <> wrote:
    > On 25/06/07, Alex Young <> wrote:
    > > Names are just *hard* - the only reliable way of handling them that I've
    > > found is to let users control it themselves...

    >
    > Agreed. My example makes very simple assumptions that I'd imagine
    > apply to the vast majority of names. However, in many computer
    > problems there are obscure exceptions that either break the program or
    > break things for the user.


    I worked at an institution that was forced to rewrite a bunch of
    name-related code for a legacy system because of a "sanity" check that
    was just plain wrong... and nobody realized it until Dr. O came to
    work. Now they had to allow one-letter surnames, too (they'd already
    allowed one-letter given or middle names, thanks to President Truman's
    middle name, S).

    Almost any assumption you make about name parsing will be wrong. For
    example, take the assumption that names are composed only of letters
    and letter-like symbols.

    http://en.wikipedia.org/wiki/Jennifer_8._Lee
    http://en.wikipedia.org/wiki/Nancy_3._Hoffman
    http://en.wikipedia.org/wiki/List_of_personal_names_that_contain_numbers
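
    To make the point concrete, here is a sketch of the kind of "sanity"
    check that looks reasonable and is not. The pattern and the method name
    naive_valid? are invented for illustration (they are not the legacy
    system's actual code), and match? assumes Ruby 2.4 or later.

```ruby
# Hypothetical check: every part must be at least two ASCII letters.
# Shown as an example of an assumption that real names break.
NAIVE_NAME = /\A[A-Za-z]{2,}(?: [A-Za-z]{2,})*\z/

def naive_valid?(name)
  NAIVE_NAME.match?(name)
end

samples = ["Alex MacCaw", "Jennifer 8. Lee", "Harry S Truman", "Dr O"]
results = samples.map { |name| [name, naive_valid?(name)] }.to_h

results.each { |name, ok| puts "#{name}: #{ok ? 'accepted' : 'rejected'}" }
```

    Only the first sample passes; the digit, the one-letter middle name and
    the one-letter surname are all rejected even though each belongs to a
    real person.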

    -Alex
     
    Alex LeDonne, Jun 25, 2007
    #11
  12. Alex LeDonne wrote:
    > Almost any assumption you make about name parsing will be wrong. For
    > example, take the assumption that names are composed only of letters
    > and letter-like symbols.


    Not to mention the assumption that each name consists of symbols that
    are part of some character set:

    http://upload.wikimedia.org/wikiped.../Prince_symbol.svg/20px-Prince_symbol.svg.png

    --
    vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407
     
    Joel VanderWerf, Jun 25, 2007
    #12
  13. On 6/25/07, Alex LeDonne <> wrote:

    > I worked at an institution that was forced to rewrite a bunch of
    > name-related code for a legacy system because of a "sanity" check that
    > was just plain wrong... and nobody realized it until Dr. O came to
    > work. Now they had to allow one-letter surnames, too (they'd already
    > allowed one-letter given or middle names, thanks to President Truman's
    > middle name, S).


    Reminds me of an old SF story "The Man Whose Name Wouldn't Fit." IIRC
    it started with a guy getting fired because his company put in a new
    computer personnel data system which had one too many characters in
    the field for last name to accommodate him (and it was too expensive to
    fix).

    I think it ended with a neo-luddite movement with a secret weapon
    which dissolved the bond between the magnetic material and the
    substrate on magnetic tapes and disks.

    Don't know how many here are old enough to remember when most
    computers used magnetic tape. <G>

    --
    Rick DeNatale

    My blog on Ruby
    http://talklikeaduck.denhaven2.com/
     
    Rick DeNatale, Jun 27, 2007
    #13
  14. Alex MacCaw

    Guest

    Hi --

    On Thu, 28 Jun 2007, Rick DeNatale wrote:

    > On 6/25/07, Alex LeDonne <> wrote:
    >
    >> I worked at an institution that was forced to rewrite a bunch of
    >> name-related code for a legacy system because of a "sanity" check that
    >> was just plain wrong... and nobody realized it until Dr. O came to
    >> work. Now they had to allow one-letter surnames, too (they'd already
    >> allowed one-letter given or middle names, thanks to President Truman's
    >> middle name, S).

    >
    > Reminds me of an old SF story "The Man Whose Name Wouldn't Fit." IIRC
    > it started with a guy getting fired because his company put in a new
    > computer personnel data system which had one too many characters in
    > the field for last name to accommodate him (and it was too expensive to
    > fix).
    >
    > I think it ended with a neo-luddite movement with a secret weapon
    > which dissolved the bond between the magnetic material and the
    > substrate on magnetic tapes and disks.
    >
    > Don't know how many here are old enough to remember when most
    > computers used magnetic tape. <G>


    I sometimes wonder whether the DECtapes in my attic would still be
    readable.


    David

     
    , Jun 27, 2007
    #14
  15. Rick DeNatale wrote:
    > On 6/25/07, Alex LeDonne <> wrote:
    >
    >> I worked at an institution that was forced to rewrite a bunch of
    >> name-related code for a legacy system because of a "sanity" check that
    >> was just plain wrong... and nobody realized it until Dr. O came to
    >> work. Now they had to allow one-letter surnames, too (they'd already
    >> allowed one-letter given or middle names, thanks to President Truman's
    >> middle name, S).

    >
    > Reminds me of an old SF story "The Man Whose Name Wouldn't Fit." IIRC
    > it started with a guy getting fired because his company put in a new
    > computer personnel data system which had one too many characters in
    > the field for last name to accommodate him (and it was too expensive to
    > fix).
    >

    I assume you meant that his name was one character too long, not that
    the field for the name was too long.

    > I think it ended with a neo-luddite movement with a secret weapon
    > which dissolved the bond between the magnetic material and the
    > substrate on magnetic tapes and disks.
    >
    > Don't know how many here are old enough to remember when most
    > computers used magnetic tape. <G>
    >

    I remember carrying boxes of punched cards. One of my previous bosses
    told me a humorous horror story about a company he worked for. The
    company magazine was doing an article on the Data Processing department
    and wanted a picture of the computers at work. The department decided
    that the best time would be during the payroll run when all of the tape
    drives would be in use. They set up the shot and when the flash went
    off all of the tape drives went off line. Everyone had forgotten about
    the optical EOT sensors.
     
    Michael W. Ryder, Jun 27, 2007
    #15
