Getting rid of punctuation in chunked strings

Discussion in 'Perl Misc' started by Stevee, Dec 9, 2005.

  1. Stevee

    Stevee Guest

    Hi all.

    Apologies if this is a newbie question, but I am new!

    I am reading in a string, splitting it into chunks on whitespace and
    placing the values in an array which I then process further to match.
    I am having problems because some of the matches are not working
    because the chunking gives things like

    "martin, "

    or

    "martin. "

    i.e. there is a comma or full stop at the end of the chunk.

    Any ideas how to remove the punctuation before I put the values in the
    array to match?

    Thanks in advance.
     
    Stevee, Dec 9, 2005
    #1

  2. A. Sinan Unur

    A. Sinan Unur Guest

    "Stevee" wrote:

    > Apologies if this is a newbie question, but I am new!
    >
    > I am reading in a string, splitting it into chunks on whitespace and
    > placing the values in an array which I then process further to match.
    > I am having problems because some of the matches are not working
    > because the chunking gives things like
    >
    > "martin, "
    >
    > or
    >
    > "martin. "
    >
    > i.e. there is a comma or full stop at the end of the chunk.
    >
    > Any ideas how to remove the punctuation before I put the values in the
    > array to match?


    Use split.

    #!/usr/bin/perl

    use strict;
    use warnings;

    my $str = <<EO_TEXT;
    I, being the obnoxious person that I am, will ask
    Mr. Steeve to please read the posting guidelines,
    given that he is new to this group.
    EO_TEXT

    my @words = split /[[:punct:]]?\s+[[:punct:]]?/, $str;

    {
    local $" = '#';
    print "@words\n";
    }
    __END__

    D:\Home\asu1\UseNet\clpmisc> tt
    I#being#the#obnoxious#person#that#I#am#will#ask#Mr#Steeve#to#please#read
    #the#posting#guidelines#given#that#he#is#new#to#this#group
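A small sketch, not part of the original post, of how the same pattern behaves on other input. The sub name `split_words` and the sample string are made up for illustration. The pattern takes at most one punctuation character on each side of a whitespace run, and it only fires where there is whitespace, so repeated punctuation and punctuation at the very end of a string with no trailing newline survive:

```perl
use strict;
use warnings;

# Hypothetical wrapper around the split pattern shown above.
sub split_words {
    my ($text) = @_;
    return split /[[:punct:]]?\s+[[:punct:]]?/, $text;
}

# Only the last of the three dots sits next to whitespace, and the
# final "?" has no whitespace after it, so both remain in the output.
print join('#', split_words("Wait... what?")), "\n";   # prints Wait..#what?
```

If that matters for your data, trimming punctuation from each chunk after the split is likely the safer route.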
    --
    A. Sinan Unur <>
    (reverse each component and remove .invalid for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    A. Sinan Unur, Dec 9, 2005
    #2

  3. Anno Siegel

    Anno Siegel Guest

    Stevee wrote in comp.lang.perl.misc:
    > Hi all.
    >
    > Apologies if this is a newbie question, but I am new!
    >
    > I am reading in a string, splitting it into chunks on whitespace and
    > placing the values in an array which I then process further to match.
    > I am having problems because some of the matches are not working
    > because the chunking gives things like
    >
    > "martin, "
    >
    > or
    >
    > "martin. "


    No, it doesn't, not if you split on white space. Your examples
    *contain* white space. Show your code so it is clear what you
    are doing.
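To illustrate the point, here is a minimal sketch (the sub name `chunks` and the sample text are assumptions, since the original code was never posted). Splitting on whitespace can leave a comma or full stop attached to a chunk, but never any whitespace:

```perl
use strict;
use warnings;

# Hypothetical helper: split on runs of whitespace. The special
# pattern ' ' also ignores leading whitespace.
sub chunks {
    my ($text) = @_;
    return split ' ', $text;
}

# Brackets make any stray whitespace visible.
print "[$_]\n" for chunks("martin, went home. ");
# prints:
# [martin,]
# [went]
# [home.]
```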

    > i.e. there is a comma or full stop at the end of the chunk.
    >
    > Any ideas how to remove the punctuation before I put the values in the
    > array to match?


    You could split on a combination of white space and punctuation:

    my $sentence = "Martin, Martin. O'Brien!";
    print "$_\n" for split /[[:space:][:punct:]]+/, $sentence;

    ...or maybe not. You need to define what is punctuation and what
    isn't.
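One possible definition, sketched here as an assumption rather than as the answer: treat punctuation as anything in `[[:punct:]]` at the start or end of a whitespace-separated chunk, and leave interior characters such as the apostrophe in O'Brien alone. The sub name `words_without_edge_punct` is invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: split on whitespace, then trim punctuation from
# both ends of each chunk. Interior punctuation is kept.
sub words_without_edge_punct {
    my ($text) = @_;
    my @words;
    for my $chunk (split ' ', $text) {
        # Work on a copy and strip leading and trailing punctuation runs.
        (my $word = $chunk) =~ s/^[[:punct:]]+|[[:punct:]]+$//g;
        # Skip chunks that consisted of punctuation only.
        push @words, $word if length $word;
    }
    return @words;
}

print "$_\n" for words_without_edge_punct("Martin, Martin. O'Brien!");
# prints:
# Martin
# Martin
# O'Brien
```

Note that `[[:punct:]]` also covers symbols such as `$` and `+`, so a hand-written character class may suit some data better.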

    Anno
    --
    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.
     
    Anno Siegel, Dec 9, 2005
    #3
  4. robic0

    robic0 Guest

    On 9 Dec 2005 04:51:21 -0800, "Stevee" wrote:

    >Hi all.
    >
    >Apologies if this is a newbie question, but I am new!
    >
    >I am reading in a string, splitting it into chunks on whitespace and
    >placing the values in an array which I then process further to match.
    >I am having problems because some of the matches are not working
    >because the chunking gives things like
    >
    >"martin, "
    >
    >or
    >
    >"martin. "
    >
    >i.e. there is a comma or full stop at the end of the chunk.
    >
    >Any ideas how to remove the punctuation before I put the values in the
    >array to match?
    >
    >Thanks in advance.

    Nobody knows what punctuation is. Search the internet for punctuation.
    Once you can define it, you're 99% there.
    (Notice I didn't post any bullshit code like the other slackers?)
     
    robic0, Dec 10, 2005
    #4
  5. robic0

    robic0 Guest

    On Fri, 09 Dec 2005 22:22:58 -0800, robic0 wrote:

    >On 9 Dec 2005 04:51:21 -0800, "Stevee" wrote:
    >
    >>Hi all.
    >>
    >>Apologies if this is a newbie question, but I am new!
    >>
    >>I am reading in a string, splitting it into chunks on whitespace and
    >>placing the values in an array which I then process further to match.
    >>I am having problems because some of the matches are not working
    >>because the chunking gives things like
    >>
    >>"martin, "
    >>
    >>or
    >>
    >>"martin. "
    >>
    >>i.e. there is a comma or full stop at the end of the chunk.
    >>
    >>Any ideas how to remove the punctuation before I put the values in the
    >>array to match?
    >>
    >>Thanks in advance.

    >Nobody knows what punctuation is. Search the internet for punctuation.
    >Once you can define it, you're 99% there.
    >(Notice I didn't post any bullshit code like the other slackers?)

    Time for a gut check upload of the King James Bible
     
    robic0, Dec 10, 2005
    #5