FAQ 4.31 How can I split a [character]-delimited string except when inside [character]?

Discussion in 'Perl Misc' started by PerlFAQ Server, Apr 13, 2011.

  1. This is an excerpt from the latest version of perlfaq4.pod, which
    comes with the standard Perl distribution. These postings aim to
    reduce the number of repeated questions as well as allow the community
    to review and update the answers. The latest version of the complete
    perlfaq is at http://faq.perl.org .

    --------------------------------------------------------------------

    4.31: How can I split a [character]-delimited string except when inside [character]?

    Several modules can handle this sort of parsing--"Text::Balanced",
    "Text::CSV", "Text::CSV_XS", and "Text::ParseWords", among others.

    Take the example case of trying to split a string that is
    comma-separated into its different fields. You can't use "split(/,/)"
    because you shouldn't split if the comma is inside quotes. For example,
    take a data line like this:

    SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"

    Because a comma inside quotes must not end the field, this is a fairly
    complex problem. Thankfully, Jeffrey Friedl, author of *Mastering
    Regular Expressions*, has handled it for us. He suggests (assuming your
    string is contained in $text):

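    @new = ();
    push(@new, $+) while $text =~ m{
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?  # groups the phrase inside the quotes
      | ([^,]+),?                       # unquoted field, up to the next comma
      | ,                               # empty field (a lone comma)
    }gx;
    # A trailing comma means one more, empty, field at the end.
    push(@new, undef) if substr($text,-1,1) eq ',';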
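As a quick check, here is a minimal, self-contained sketch that runs the pattern above on the sample data line. The printing loop and the "(undef)" marker are illustrative additions, not part of the FAQ answer.

```perl
use strict;
use warnings;

# Sample record from the text above.
my $text = 'SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"';

my @new;
push @new, $+ while $text =~ m{
      "([^\"\\]*(?:\\.[^\"\\]*)*)",?   # quoted field, backslash escapes allowed
    | ([^,]+),?                        # unquoted, non-empty field
    | ,                                # empty field (pushes undef)
}gx;
push @new, undef if substr($text, -1, 1) eq ',';

# Print one field per line, marking undefined (empty, unquoted) fields.
for my $i (0 .. $#new) {
    printf "%2d: %s\n", $i, defined $new[$i] ? "[$new[$i]]" : '(undef)';
}
```

The sample line should come back as 11 fields, with the commas inside "Cimetrix, Inc" and "Error, Core Dumped" left intact. Note that a quoted empty field ("") yields an empty string, while an unquoted empty field (two adjacent commas) yields undef.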

    If you want to represent quotation marks inside a
    quotation-mark-delimited field, escape them with backslashes (e.g.,
    "like \"this\"").
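One detail worth seeing in action: the pattern accepts backslash-escaped quotes, but the captured text still contains the backslashes. The sketch below wraps the pattern in a helper called split_fields (a hypothetical name used only for illustration) and strips the backslashes in a second pass. The unescaping step is an assumption about what you want, not part of the FAQ answer.

```perl
use strict;
use warnings;

# split_fields is a hypothetical helper name wrapping the pattern above.
sub split_fields {
    my ($text) = @_;
    my @new;
    push @new, $+ while $text =~ m{
          "([^\"\\]*(?:\\.[^\"\\]*)*)",?
        | ([^,]+),?
        | ,
    }gx;
    push @new, undef if substr($text, -1, 1) eq ',';
    return @new;
}

# Single quotes keep the backslashes literal in the input string.
my @fields = split_fields('a,"like \"this\"",b');

# The capture keeps the backslashes, so remove them in a second pass.
# This also unescapes unquoted fields, which may or may not be desired.
my @clean = map {
    my $f = $_;
    $f =~ s/\\(.)/$1/gs if defined $f;
    $f;
} @fields;

print "$fields[1]\n";   # raw capture, backslashes still present
print "$clean[1]\n";    # backslashes removed
```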

    Alternatively, the "Text::ParseWords" module (part of the standard Perl
    distribution) lets you say:

    use Text::ParseWords;
    @new = quotewords(",", 0, $text);
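For comparison, a minimal sketch of the Text::ParseWords approach applied to the same sample line. The second argument to quotewords is the "keep" flag; passing 0 asks the module to strip the quotes and backslash escapes from the returned fields.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Sample record from the text above.
my $text = 'SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"';

# keep = 0: quotes and backslash escapes are removed from each field.
my @new = quotewords(',', 0, $text);

print scalar(@new), " fields\n";
print "[$_]\n" for @new;
```

One caveat: quotewords treats single quotes as quoting characters too, so data containing a stray apostrophe (such as an unquoted O'Brien) may not parse the way you expect. For input like that, one of the CSV modules is probably a better fit.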



    --------------------------------------------------------------------

    The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
    are not necessarily experts in every domain where Perl might show up,
    so please include as much information as possible and relevant in any
    corrections. The perlfaq-workers also don't have access to every
    operating system or platform, so please include relevant details for
    corrections to examples that do not work on particular platforms.
    Working code is greatly appreciated.

    If you'd like to help maintain the perlfaq, see the details in
    perlfaq.pod.
     