Binding array to pattern

Discussion in 'Perl Misc' started by Shmuel (Seymour J.) Metz, Jun 8, 2006.

  1. I'd like to bind an array to a pattern. I couldn't find anything in
    the Camel book about the context for the left side of the binding
    operator. I ran some tests, and it appears that I get scalar context
    if I write

    while (@anarray =~ /pattern/g) {
    block;
    }

    which means that I match against the size rather than the contents. I
    tried

    while ("@anarray" =~ /pattern/g) {
    block;
    }

    but that went into a loop. Is there a better way to do this than

    $_="@anarray";
    while (/pattern/g) {
    block;
    }

    ?

    Is there a description that I missed of interpolation for the left
    side of the binding operator?

    --
    Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

    Unsolicited bulk E-mail subject to legal action. I reserve the
    right to publicly post or ridicule any abusive E-mail. Reply to
    domain Patriot dot net user shmuel+news to contact me. Do not
    reply to
     
    Shmuel (Seymour J.) Metz, Jun 8, 2006
    #1

  2. Shmuel (Seymour J.) Metz

    Guest

    "Shmuel (Seymour J.) Metz" <> wrote:
    > I'd like to bind an array to a pattern. I couldn't find anything in
    > the Camel book about the context for the left side of the binding
    > operator. I ran some tests, and it appears that I get scalar context
    > if I write


    I don't know about the camel book, but from perldoc perlop:
    Binding Operators
    Binary "=~" binds a scalar
    expression to a pattern
    match.

    So yes, it is a scalar.
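
A minimal sketch of what that perlop passage implies, using made-up sample addresses. The array is wrapped in an explicit scalar() here to make the context visible; binding a bare array behaves the same way, and perl may warn about it.

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the thread.
my @anarray = ('abuse@example.com', 'postmaster@example.com', 'noc@example.com');

# What "@anarray =~ /pattern/" effectively tests: the element count (3 here).
my $count_matches = scalar(@anarray) =~ /^3$/ ? 1 : 0;

# The contents are never examined, so this is false even though one
# element contains "abuse".
my $count_has_abuse = scalar(@anarray) =~ /abuse/ ? 1 : 0;

print "count matched: $count_matches, 'abuse' matched: $count_has_abuse\n";
```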

    >
    > while ("@anarray" =~ /pattern/g) {
    > block;
    > }
    >
    > but that went into a loop.


    Of course it did, what with the while there and all. Oh, you mean
    an infinite loop. Yep, it does seem to. But then again, so does:

    while ("$_" =~ /pattern/g) {

    So apparently the reinterpolation is performed each time and thus the
    string is not known to be the same.


    > Is there a better way to do this than
    >
    > $_="@anarray";
    > while (/pattern/g) {
    > block;
    > }


    I have no idea why you want to do it in the first place, but I can't think
    of a better way to convert an array into a string and then repeatedly
    match on it in a while loop. (I guess localizing $_ first might be a
    good idea.) I guess one possibly better alternative would be:

    foreach ("@anarray" =~ /pattern/g) {

    As it seems unlikely that the intermediate list would break the memory
    bank.

    Xho

     
    , Jun 8, 2006
    #2

  3. Shmuel (Seymour J.) Metz wrote:
    > I'd like to bind an array to a pattern.


    Why?

    > Is there a better way to do this than
    >
    > $_="@anarray";
    > while (/pattern/g) {
    > block;
    > }
    >
    > ?


    This is more readable IMO:

    foreach my $element ( @anarray ) {
        while ( $element =~ /PATTERN/g ) {
            ...
        }
    }

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Jun 9, 2006
    #3
  4. Shmuel (Seymour J.) Metz <> wrote:

    > I'd like to bind an array to a pattern.



    That makes no sense.

    A pattern match is *defined* to operate on a string. An array
    is not a string.

    Why would you like to bind an array to a pattern?

    (and what is your new definition of "pattern match" to go along with it?)

    Do you instead want to apply a pattern match to each _element_
    of an array? If so, then use foreach or grep.
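
As a rough illustration of the grep suggestion (the sample addresses and variable names are invented here): in scalar context grep yields a count, which can serve directly as a term in a boolean expression, while in list context it yields the matching elements.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my @anarray = ('hostmaster@example.net', 'abuse@example.net');

# Scalar context: the number of elements that match.
my $any_abuse = grep { /abuse/ } @anarray;

# List context: the matching elements themselves.
my @abuse_only = grep { /abuse/ } @anarray;

print "found $any_abuse matching element(s): @abuse_only\n" if $any_abuse;
```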


    > I
    > tried
    >
    > while ("@anarray" =~ /pattern/g) {
    > block;
    > }
    >
    > but that went into a loop.



    Errr, the "while" construct _is_ a loop. If you don't want a loop,
    then don't use "while".


    > Is there a description that I missed of interpolation for the left
    > side of the binding operator?



    Interpolation has nothing to do with any of Perl's other operators.

    Interpolation happens with "strings" without regard to what
    operator the string is an operand for.



    What is it that you are ultimately trying to achieve?


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Jun 9, 2006
    #4
  5. Shmuel (Seymour J.) Metz wrote:
    > I'd like to bind an array to a pattern. I couldn't find anything in
    > the Camel book about the context for the left side of the binding
    > operator. I ran some tests, and it appears that I get scalar context
    > if I write
    >
    > while (@anarray =~ /pattern/g) {
    > block;
    > }
    >
    > which means that I match against the size rather than the contents. I
    > tried
    >
    > while ("@anarray" =~ /pattern/g) {
    > block;
    > }
    >
    > but that went into a loop.


    The problem is that you are using an expression with =~ //g

    There was a similar problem here a while back.

    http://groups.google.com/group/comp..._frm/thread/1cdea0dc1313b9b7/9bdc49d7f9d7bf31

    > Is there a description that I missed of interpolation for the left
    > side of the binding operator?


    It's not the fact that it's interpolation, it's the fact that it's an
    rvalue expression, so each time round the while() loop the =~ is binding
    to a new string and the /g position pointer starts again at zero.
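
A small sketch of the consequence, with made-up data: if the interpolated string is copied once into a named variable ($joined is a name invented here), the /g position is stored on that variable, each iteration resumes where the last one stopped, and the loop terminates.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my @anarray = ('foo bar', 'baz foo');

# Interpolate once; the string is "foo bar baz foo".
my $joined = "@anarray";

my @end_offsets;
while ($joined =~ /foo/g) {
    # pos() reports where the last successful /g match on $joined ended.
    push @end_offsets, pos($joined);
}

print "matches ended at offsets: @end_offsets\n";   # offsets 3 and 15
```

This is the same idea as assigning to $_ before the loop, only with a lexical variable instead of the global.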
     
    Brian McCauley, Jun 9, 2006
    #5
  6. In <>, on 06/09/2006
    at 01:30 AM, Gunnar Hjalmarsson <> said:

    >Shmuel (Seymour J.) Metz wrote:
    >> I'd like to bind an array to a pattern.


    >Why?


    I want to test for a match anywhere in the array.

    >This is more readable IMO:
    > foreach my $element ( @anarray ) {
    > while ( $element =~ /PATTERN/g ) {
    > ...
    > }
    > }


    I had simplified my code because I was primarily concerned with the
    endless loop rather than style. What I actually wound up with[1] was

    foreach (sort keys %{$host_info{$host}{Email}}) {
        push @Contacts, @{$host_info{$host}{Email}{$_}};
        my $scalarContacts="@{$host_info{$host}{Email}{$_}}";
        push @abuseContacts, @{$host_info{$host}{Email}{$_}}
            if (/abuse/ or $scalarContacts =~ /abuse/);
    }

    and I'd rather avoid replicating the push statement. Given that, is
    there a better style?

    Thanks.

    [1] The code is quick and dirty and at some point I intend to
    do some massive cleanup, but it's still a work in progress.

     
    Shmuel (Seymour J.) Metz, Jun 9, 2006
    #6
  7. In <>, on 06/08/2006
    at 07:33 PM, Tad McClellan <> said:

    >That makes no sense.


    To you. Others had no trouble understanding it.

    >Why would you like to bind an array to a pattern?


    Because I want all of the matches on all of the strings in the array.

    >(and what is your new definition of "pattern match" to go along with
    >it?)


    What is your definition of "new", and are you really older than
    Griswold?

    >Do you instead want to apply a pattern match to each _element_ of an
    >array?


    That would be obvious if you looked at the code. Do you know how
    interpolation works inside quotes?

    >If so, then use foreach or grep.


    That would complicate the logic in this case.

    >Errr, the "while" construct _is_ a loop. If you don't want a loop,
    >then don't use "while".


    Sorry, I meant nonterminating loop. Another poster has explained the
    problem.

    >What is it that you are ultimately trying to achieve?


    Scan an array for a pattern as part of a larger boolean expression.
    The code that I posted was part of debug scaffolding that I wrote
    while trying to resolve the original problem.

     
    Shmuel (Seymour J.) Metz, Jun 9, 2006
    #7
  8. In <>, on
    06/09/2006
    at 05:17 AM, "Brian McCauley" <> said:

    >It's not the fact that it's interpolation, it's the fact that it's an
    >rvalue expression, so each time round the while() loop the =~ is
    >binding to a new string and the /g position pointer starts again
    >at zero.


    Thanks.

     
    Shmuel (Seymour J.) Metz, Jun 9, 2006
    #8
  9. In <20060608191324.722$>, on 06/08/2006
    at 10:08 PM, said:

    >So apparently the reinterpolation is performed each time and thus the
    >string is not known to be the same.


    Ouch!

    >Of course it did, what with the while there and all. Oh, you mean
    >an infinite loop.


    Well, it ends when I do ^c ;-) Sorry, I should have been clearer.

    >I have no idea why you want to do it in the first place,


    I need a term in a boolean expression for a match anywhere in the
    array.

    >I guess one possibly better alternative would be:
    >foreach ("@anarray" =~ /pattern/g) {


    That would have the wrong semantics even if it didn't go into an
    endless loop. IAC, the code that I posted was test cases intended to
    help me track down the original problem. The original failing code was
    a match inside a boolean expression, and I have currently changed it
    to the following:

    foreach (sort keys %{$host_info{$host}{Email}}) {
        push @Contacts, @{$host_info{$host}{Email}{$_}};
        my $scalarContacts="@{$host_info{$host}{Email}{$_}}";
        push @abuseContacts, @{$host_info{$host}{Email}{$_}}
            if (/abuse/ or $scalarContacts =~ /abuse/);
    }

    I could use

    foreach my $type (sort keys %{$host_info{$host}{Email}}) {

    and throw in a nested

    foreach (@{$host_info{$host}{Email}{$type}}) {

    but I'd consider replicating the push to be uglier than coercing the
    array to a string.

    Thanks.

     
    Shmuel (Seymour J.) Metz, Jun 9, 2006
    #9
  10. Uri Guttman (Guest)

    >>>>> "SJM" == Shmuel (Seymour J ) Metz <> writes:

    SJM> foreach (sort keys %{$host_info{$host}{Email}}) {
    SJM> push @Contacts, @{$host_info{$host}{Email}{$_}};
    SJM> my $scalarContacts="@{$host_info{$host}{Email}{$_}}";
    SJM> push @abuseContacts, @{$host_info{$host}{Email}{$_}}
    SJM> if (/abuse/ or $scalarContacts =~ /abuse/);
    SJM> }

    SJM> and I'd rather avoid replicating the push statement. Given that, is
    SJM> there a better style?

    disregarding the loop issues, that is very hard to read code. notice the
    massive redundant use of $host_info{$host}{Email} in there? factor that
    out into a scalar before the loop. and then it can become almost
    readable (with some needed whitespace too)

    my $email_info = $host_info{$host}{Email} ;

    foreach (sort keys %{$email_info}) {

    my $emails = $email_info->{$_} ;
    push @Contacts, @{$emails};
    push @abuseContacts, @{$emails};
    my $scalarContacts = "@{$emails}";

    if (/abuse/ or $scalarContacts =~ /abuse/);

    that method of checking a joined string vs scanning a array bothers
    me. and why do you push the same stuff into 2 different arrays?
    if you used List::Util::first you can scan for the first abuse email in
    the array and it could be faster as you don't make up the string first.

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
     
    Uri Guttman, Jun 9, 2006
    #10
  11. Shmuel (Seymour J.) Metz <> wrote:

    > I need a term in a boolean expression for a match anywhere in the
    > array.



    > if (/abuse/ or $scalarContacts =~ /abuse/);



    if grep /abuse/, $_, @{$host_info{$host}{Email}{$_}};


     
    Tad McClellan, Jun 9, 2006
    #11
  12. Shmuel (Seymour J.) Metz <> wrote:
    > In <>, on 06/08/2006
    > at 07:33 PM, Tad McClellan <> said:
    >
    >>That makes no sense.



    [ The "that" was snipped, it was:
    I'd like to bind an array to a pattern.
    ]

    > To you. Others had no trouble understanding it.



    Neither you nor any of the others understood "bind an array to a pattern".

    The rest of your OP below here, and the followups I've seen so far,
    were all about binding a string to a pattern (a string (supposedly)
    made up of strings taken from some array).


    >>Why would you like to bind an array to a pattern?

    >
    > Because I want all of the matches on all of the strings in the array.



    My suggestion would do that for you.


    >>Do you instead want to apply a pattern match to each _element_ of an
    >>array?

    >
    > That would be obvious if you looked at the code.



    It was not obvious if you _understood_ the code.

    I looked at it and saw that

    @array = ('a', 'b');
    print "true" if "@array" =~ /a b/g;

    would fail to do the Right Thing.
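
A short sketch of that pitfall, contrasting the joined string with a per-element test. It assumes the default list separator ($" is a single space); the variable names are invented here.

```perl
use strict;
use warnings;

my @array = ('a', 'b');

# Interpolation joins the elements with $" (a space by default), so the
# string is "a b" and the pattern matches across two separate elements.
my $joined_matches = "@array" =~ /a b/ ? 1 : 0;

# grep tests each element on its own; no single element contains "a b".
my $element_matches = grep { /a b/ } @array;

print "joined: $joined_matches, per element: $element_matches\n";
```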


    > Do you know how
    > interpolation works inside quotes?



    Better than you do, apparently.


    >>If so, then use foreach or grep.

    >
    > That would complicate the logic in this case.



    No it wouldn't.


    > Scan an array for a pattern as part of a larger boolean expression.



    Use grep(), just like I suggested then.


     
    Tad McClellan, Jun 9, 2006
    #12
  13. In <>, on 06/09/2006
    at 05:20 PM, Uri Guttman <> said:

    >disregarding the loop issues, that is very hard to read code. notice
    >the massive redundant use of $host_info{$host}{Email} in there?
    >factor that out into a scalar before the loop. and then it can become
    >almost readable (with some needed whitespace too)


    Thanks.

    > my $email_info = $host_info{$host}{Email} ;


    I assume that $email_info will be a reference, so that stores into it
    will go into $host_info?

    > push @Contacts, @{$emails};
    > push @abuseContacts, @{$emails};


    No; that would change the logic. What would be needed is:

    push @abuseContacts, @{$emails};
    my $scalarContacts = "@{$emails}"
    if (/abuse/ or $scalarContacts =~ /abuse/);

    >and why do you push the same stuff into 2 different arrays?


    I don't; the 2nd push is conditional.

    >if you used List::Util::first you can scan for the first abuse
    >email in the array and it could be faster as you don't make up the
    >string first.


    It's more complicated than that; there are two arrays and if the
    second has any entries then I need to use it in place of the first. So
    I'd need to use something like grep to extract the abuse entries if I
    didn't select them out into a separate array.

    if (@abuseContacts) {
        print $fhLookup "Abuse contacts: ", join(', ',@abuseContacts),"\n";
    } else {
        print $fhLookup "contacts: ", join(', ',@Contacts),"\n";
    }

    Just for laughs I checked Hash::Util to see if there was an equivalent
    to List::Util::first, but no joy.

     
    Shmuel (Seymour J.) Metz, Jun 15, 2006
    #13
  14. Uri Guttman (Guest)

    >>>>> "SM" == Shmuel (Seymour J ) Metz <> writes:

    >> my $email_info = $host_info{$host}{Email} ;


    SM> I assume that $email_info will be a reference, so that stores into it
    SM> will go into $host_info?

    it has to be a reference as you created a ref there earlier when you
    built the structure. all scalar values in a structure that hold lower
    level structures are refs.
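
A minimal sketch of that point, using an invented structure shaped like the one in the thread: copying $host_info{$host}{Email} into a scalar copies the reference, so a store through the copy shows up in %host_info.

```perl
use strict;
use warnings;

# Hypothetical data shaped like the thread's %host_info.
my %host_info = (
    'example.org' => { Email => { abuse => ['abuse@example.org'] } },
);
my $host = 'example.org';

# This copies the reference, not the hash it points to.
my $email_info = $host_info{$host}{Email};

# A store through the copy is therefore visible in %host_info.
$email_info->{tech} = ['noc@example.org'];

print join(', ', sort keys %{ $host_info{$host}{Email} }), "\n";   # abuse, tech
```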

    >> push @Contacts, @{$emails};
    >> push @abuseContacts, @{$emails};


    SM> No; that would change the logic. What would be needed is:

    SM> push @abuseContacts, @{$emails};
    SM> my $scalarContacts = "@{$emails}"
    SM> if (/abuse/ or $scalarContacts =~ /abuse/);

    that is wrong. you didn't make $scalarContacts before you tested it.

    >> and why do you push the same stuff into 2 different arrays?


    SM> I don't; the 2nd push is conditional.

    but based on wrong logic. i still don't know your goals here.

    >> if you used List::Util::first you can scan for the first abuse
    >> email in the array and it could be faster as you don't make up the
    >> string first.


    SM> It's more complicated than that; there are two arrays and if the
    SM> second has any entries then I need to use it in place of the first. So
    SM> I'd need to use something like grep to extract the abuse entries if I
    SM> didn't select them out into a separate array.

    that is very unclear to me. you need to learn how to express your
    requirements better. that makes it much easier to code to them. i can't
    read your mind so you have to explain it in very clear english.

    SM> if (@abuseContacts) {
    SM> print $fhLookup "Abuse contacts: ", join(', ',@abuseContacts),"\n";
    SM> } else {
    SM> print $fhLookup "contacts: ", join(', ',@Contacts),"\n";
    SM> }

    SM> Just for laughs I checked Hash::Util to see if there was an equivalent
    SM> to List::Util::first, but no joy.

    that makes no sense as there is no order in hashes so first can't
    exist. you want any() from Quantum::Superpositions or one of the perl6
    modules.

    again, please write up VERY clear requirements as it will help you and
    us.

    uri

     
    Uri Guttman, Jun 15, 2006
    #14
  15. Shmuel (Seymour J.) Metz <> wrote:
    > In <>, on 06/09/2006
    > at 05:20 PM, Uri Guttman <> said:


    >>the massive redundant use of $host_info{$host}{Email} in there?
    >>factor that out into a scalar before the loop.



    >> push @Contacts, @{$emails};
    >> push @abuseContacts, @{$emails};

    >
    > No; that would change the logic. What would be needed is:
    >
    > push @abuseContacts, @{$emails};
    > my $scalarContacts = "@{$emails}"
    > if (/abuse/ or $scalarContacts =~ /abuse/);



    You flipped the lines from what you posted before, which was:

    push @Contacts, @{$host_info{$host}{Email}{$_}};
    my $scalarContacts="@{$host_info{$host}{Email}{$_}}";
    push @abuseContacts, @{$host_info{$host}{Email}{$_}}
    if (/abuse/ or $scalarContacts =~ /abuse/);


    Applying both Uri's and my suggestions to that code yields:

    push @Contacts, @{$emails};
    push @abuseContacts, @{$emails}
    if grep /abuse/, $_, @{$emails};

    Or, taking advantage of the special case of the reference
    being a simple standalone scalar allowing you to leave out
    some curlies:

    push @Contacts, @$emails;
    push @abuseContacts, @$emails
    if grep /abuse/, $_, @$emails;
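
For context, here is a self-contained sketch of how those lines might sit inside the whole loop. The contents of %host_info are invented sample data, and the final report goes to STDOUT rather than the thread's $fhLookup handle; both are assumptions made only so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical sample data; tags are the hash keys.
my %host_info = (
    'example.org' => {
        Email => {
            abuse => ['security@example.org'],                  # "abuse" only in the tag
            admin => ['hostmaster@example.org'],                # no match at all
            tech  => ['noc@example.org', 'abuse@example.org'],  # "abuse" in an address
        },
    },
);
my $host = 'example.org';
my (@Contacts, @abuseContacts);

my $email_info = $host_info{$host}{Email};

foreach (sort keys %$email_info) {
    my $emails = $email_info->{$_};
    push @Contacts, @$emails;
    # $_ (the tag) and every address are each tested on their own.
    push @abuseContacts, @$emails
        if grep /abuse/, $_, @$emails;
}

if (@abuseContacts) {
    print "Abuse contacts: ", join(', ', @abuseContacts), "\n";
} else {
    print "contacts: ", join(', ', @Contacts), "\n";
}
```

Note that a match on any one address pushes every address under that tag, which is what the original three-line version did as well.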


    >>and why do you push the same stuff into 2 different arrays?

    >
    > I don't; the 2nd push is conditional.



    Not in the code you posted this time.


     
    Tad McClellan, Jun 15, 2006
    #15
  16. In <>, on 06/15/2006
    at 11:44 AM, Uri Guttman <> said:

    >it has to be a reference as you created a ref there earlier when you
    >built the structure. all scalar values in a structure that hold lower
    >level structures are refs.


    Thanks.

    >that is wrong.


    Whoops! I edited the text from the article instead of doing a
    cut-and-paste directly from my code. Make that

    my $email_contact = $email_info->{$_} ;
    push @Contacts, @{$email_contact};
    my $scalarContacts="@{$email_contact}";
    push @abuseContacts, @{$email_contact}
    if (/abuse/ or $scalarContacts =~ /abuse/);

    >but based on wrong logic.


    Please see above.

    >i still don't know your goals here.


    My goal is to construct one of two messages; one containing only the
    abuse matches and the other containing all of the array elements,
    depending on whether there are any abuse matches. I was trying to
    quote only the minimum code needed to provide context, not all 620
    lines.

    >that is very unclear to me.


    I'm extracting e-mail contacts from whois data. In some cases there
    are multiple contacts for the same role. The abuse contacts might have
    the word "abuse" in the addresses or might have it only in the tags.
    If there are abuse contacts then I want to put them in a message;
    otherwise I want to put all of the e-mail in a different message. That's
    part of a larger program that deobfuscates a spam e-mail and attempts
    to locate information on the sender and the drop boxes for use in a
    complaint.

    >that makes no sense as there is no order in hashes so first can't
    >exist. you want any() from Quantum::Superpositions or one of the
    >perl6 modules.


    When will Perl6 be ready for prime time?

     
    Shmuel (Seymour J.) Metz, Jun 16, 2006
    #16
  17. In <>, on 06/15/2006
    at 12:57 PM, Tad McClellan <> said:

    >You flipped the lines from what you posted before,


    I edited the text of the article instead of clipping from my code, and
    inadvertently left some out. That should be

    my $email_contact = $email_info->{$_} ;
    push @Contacts, @{$email_contact};
    my $scalarContacts="@{$email_contact}";
    push @abuseContacts, @{$email_contact}
    if (/abuse/ or $scalarContacts =~ /abuse/);

    > if grep /abuse/, $_, @$emails;


    How is that better than scanning the derived scalar? Although it might
    be better to do

    my $scalarContacts="$_ @$email_contact";
    push @abuseContacts, @$email_contact
    if $scalarContacts =~ /abuse/;

     
    Shmuel (Seymour J.) Metz, Jun 16, 2006
    #17
  18. Ben Morrow (Guest)

    Quoth "Shmuel (Seymour J.) Metz" <>:
    > In <>, on 06/15/2006
    > at 11:44 AM, Uri Guttman <> said:
    >
    > >that is wrong.

    >
    > Whoops! I edited the text from the article instead of doing a
    > cut-and-paste directly from my code. Make that
    >
    > my $email_contact = $email_info->{$_} ;
    > push @Contacts, @{$email_contact};


    > my $scalarContacts="@{$email_contact}";
    > push @abuseContacts, @{$email_contact}
    > if (/abuse/ or $scalarContacts =~ /abuse/);


    These three lines are equivalent to

    push @abuseContacts, @{$email_contact}
    if grep /abuse/, $_, @{$email_contact};

    except that doesn't waste time (both for the machine and the human
    reading the code) stringifying the array; and there aren't problems with

    $email_contact = ['fooab', 'use this'];

    (that only bites if $" is empty or the pattern can match the separator,
    so I suspect this isn't an issue in your case).

    You do realise this will push *all* of @{$email_contact} onto
    @abuseContacts, if *any* of them match? From your description below I
    can't quite see how this could be what you want.

    > I'm extracting e-mail contacts from whois data. In some cases there
    > are multiple contacts for the same role. The abuse contacts might have
    > the word "abuse" in the addresses or might have it only in the tags.
    > If there are abuse contacts then I want to put them in a message;


    By 'abuse contacts' you mean 'email addresses matching /abuse/', right?
    [Side issue: are you sure you mean /abuse/ and not /^abuse\@/ ?]

    > otherwise I want to put all of the e-mail in a different message.


    ....so you can read the tags and find the correct addr by hand? Do you
    want the whole whois reply, or just all the email addresses in the reply?

    In any case, I'd do something like (untested)

    my @abuse_addrs;
    my @misc_addrs;
    my @domains = ...;

    for (@domains) {
        my $whois = get_whois_data($_);
        my @emails = extract_email_addrs($whois);

        my @ae = grep /abuse/, @emails;
        if (@ae) {
            push @abuse_addrs, @ae;
        }
        else {
            push @misc_addrs, $whois; # or @emails
        }
    }

    Or have I misunderstood you?

    > >that makes no sense as there is no order in hashes so first can't
    > >exist. you want any() from Quantum::Superpositions or one of the
    > >perl6 modules.

    >
    > When will Perl6 be ready for prime time?


    Err... not for a while :). Perl5 will be the supported and developed
    version of Perl for the foreseeable future. Some features of Perl6 are
    available for Perl5 in the modules in the Perl6::* namespace; any() is
    in Perl6::Junction (or, as Uri said, in Quantum::Superpositions, though
    that's likely much slower); also in List::MoreUtils, which is probably
    what I'd use if I needed it.
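
A hedged sketch of the any()/first() idea. It assumes a perl newer than this thread, where the core List::Util module (version 1.33 or later) exports any() itself; on older installations List::MoreUtils provides an any() as described above. The sample addresses are invented.

```perl
use strict;
use warnings;
use List::Util qw(any first);   # any() needs List::Util 1.33 or later

# Hypothetical sample data.
my @emails = ('hostmaster@example.com', 'abuse@example.com', 'abuse-desk@example.com');

# any() stops at the first match and returns a true/false value.
my $has_abuse = any { /abuse/ } @emails;

# first() returns the first matching element, or undef if there is none.
my $first_abuse = first { /abuse/ } @emails;

print "first abuse address: $first_abuse\n" if $has_abuse;
```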

    Ben

    --
    For far more marvellous is the truth than any artists of the past imagined!
    Why do the poets of the present not speak of it? What men are poets who can
    speak of Jupiter if he were like a man, but if he is an immense spinning sphere
    of methane and ammonia must be silent? [Feynman]
     
    Ben Morrow, Jun 16, 2006
    #18
  19. Uri Guttman (Guest)

    >>>>> "S(J)M" == Shmuel (Seymour J ) Metz <> writes:

    S(J)M> Whoops! I edited the text from the article instead of doing a
    S(J)M> cut-and-paste directly from my code. Make that

    always cut/paste real code here. otherwise you waste your and our time.

    >> that is very unclear to me.


    S(J)M> I'm extracting e-mail contacts from whois data. In some cases there
    S(J)M> are multiple contacts for the same role. The abuse contacts might have
    S(J)M> the word "abuse" in the addresses or might have it only in the tags.
    S(J)M> If there are abuse contacts then I want to put them in a message;
    S(J)M> otherwise I want to put all of the e-mail in a different message. That's
    S(J)M> part of a larger program that deobfuscates a spam e-mail and attempts
    S(J)M> to locate information on the sender and the drop boxes for use in a
    S(J)M> complaint.

    again, you are somewhat unclear. 'put them in a message' means what? in
    the to: fields of email? in the body? 'all of the email' means what? all
    addresses in the whole whois record? only those with abuse in the
    address? only those emails in a section which mentions abuse? specifying
    clean problem requirements is the key to any solution.

    i smell an XY problem here. it is always best to explain the original
    problem than to ask how to solve it in the way you picked. just going
    back to the whois data may make this whole thing much easier. what is
    the format of the whois records?

    >> that makes no sense as there is no order in hashes so first can't
    >> exist. you want any() from Quantum::Superpositions or one of the
    >> perl6 modules.


    S(J)M> When will Perl6 be ready for prime time?

    there are perl6 modules on cpan which are written in perl5. look at the
    Perl6:: namespace.

    but i will wait until i see the whois stuff. solving your problem from
    that level looks like it will be much easier.

    uri

     
    Uri Guttman, Jun 16, 2006
    #19
  20. Shmuel (Seymour J.) Metz <> wrote:
    > In <>, on 06/15/2006
    > at 12:57 PM, Tad McClellan <> said:
    >
    >>You flipped the lines from what you posted before,

    >
    > I edited the text of the article instead of clipping from my code,



    Yes, I could tell.

    Please follow the posting guidelines to avoid launching such
    red herrings in the future.


    > my $scalarContacts="@{$email_contact}";
    > push @abuseContacts, @{$email_contact}
    > if (/abuse/ or $scalarContacts =~ /abuse/);
    >
    >> if grep /abuse/, $_, @$emails;

    >
    > How is that better than scanning the derived scalar?



    1) you don't have to do the deriving of any scalar (save a few cycles,
    both in silicon and in grey matter)

    2) it is not vulnerable to the bug that I pointed out earlier
    (That would be reason enough for me to not get used to doing
    it that way. Inserting bugs is bad.)

    3) you don't end up having the regex engine compile the same
    regex multiple times (save more than a few cycles)


    When I suggested using grep() in my very first followup, you
    dismissed it because it would "complicate the logic".

    push @abuseContacts, @{$email_contact}
    if grep /abuse/, $_, @{$email_contact};

    Those 2 lines are less complicated than your 3 lines quoted above.

    Using grep() in this situation *simplifies* the logic (and avoids
    the potential bug).


     
    Tad McClellan, Jun 16, 2006
    #20
