Re: More idiomatic

Discussion in 'Perl Misc' started by J. Gleixner, Apr 15, 2013.

  1. J. Gleixner

    J. Gleixner Guest

    On 04/15/13 13:17, David Harmon wrote:
    > Here is some code I wrote in another newsgroup. It is probably
    > obvious that it was written by an old C hacker. If anybody would
    > like to suggest how it could be better perl or more idiomatic,
    > go ahead.
    >
    >>>>

    >
    > use strict; use warnings;
    > open my $fin, 't1.txt';


    Use 3-argument open handle failure (die) if it can't open.
    perldoc -f open

    > my $fromline = '';
    > my $useragent = '';
    > my $hascont = 'n';
    > my $crosspost = 0;
    > my %statstab;



    >
    > while (<$fin>) {
    > chomp;
    > if (/Newsgroups: .*,/i) {
    > $crosspost = 1;
    > next;
    > }


    > if (/^Content-Type:/i) {
    > $hascont = 'y';
    > next;
    > }


    Even a C Hacker should have learned about using else...else if...

    Use elsif on the rest of these, since it can't start with more than one
    of these.. and avoids the need of 'next;' in every if. If none of these
    could also have 'Newsgroups: ' in it, then all of these, except for the
    first if, should be elsif's.

    perldoc perlintro

    > if (/^From: /i) {
    > $fromline = $_;
    > $fromline =~ s/<.*>/<...>/;
    > next;
    > }
    > if (/^User-Agent:|^X-News-Software:|^X-Newsreader:/i) {

    /^(?:User-Agent|X-News-Software|X-Newsreader):/i
    Tiny bit shorter..

    > $useragent = $_;
    > $useragent =~ s/ \(.*//;
    > $useragent =~ s/^.*:\s*//;
    > next;
    > }
    > if (/^$/) {
    > if ($fromline ne '' and not $crosspost) {
    > ++$statstab{$fromline.'##'.$useragent.'##'.$hascont};
    > }
    > $fromline = $useragent = '';
    > $crosspost = 0; $hascont = 'n';
    > }
    > }

    close( $fin );
    >
    > foreach (sort keys %statstab) {

    next unless statstab{$_} > 1;

    > if ( $statstab{$_}> 1) {
    > ($fromline,$useragent,$hascont) = split '##';


    Since they're used specifically as a result of split, in
    this loop, then define them ('my') in this scope instead
    of re-using the global variables. It works, in this case,
    however it's better to narrow the scope of the variables.

    > printf("%s %-24s %3d %s\n",
    > $hascont, $useragent, $statstab{$_}, $fromline);
    > }
    > }
    J. Gleixner, Apr 15, 2013
    #1
    1. Advertising

  2. "J. Gleixner" <> writes:
    > On 04/15/13 13:17, David Harmon wrote:


    [...]

    >> while (<$fin>) {
    >> chomp;
    >> if (/Newsgroups: .*,/i) {
    >> $crosspost = 1;
    >> next;
    >> }

    >
    >> if (/^Content-Type:/i) {
    >> $hascont = 'y';
    >> next;
    >> }

    >
    > Even a C Hacker should have learned about using else...else if...


    This doesn't make much sense here because there is no common code
    after the sequence of if-statements. The 'next' communicates this
    clearly. Using an if - elsif - elsif - else cascade instead would
    server to obfuscate this fact instead.

    [...]

    >> foreach (sort keys %statstab) {

    > next unless statstab{$_} > 1;
    >
    >> if ( $statstab{$_}> 1) {
    >> ($fromline,$useragent,$hascont) = split '##';

    >
    > Since they're used specifically as a result of split, in
    > this loop, then define them ('my') in this scope instead
    > of re-using the global variables. It works, in this case,
    > however it's better to narrow the scope of the variables.


    It is not 'better' to declare identically named variables in bazillion
    inner scopes except if one if optimizing for 'maximal outsider
    confusion' aka 'It was hard to write. It should be hard to read."
    Rainer Weikusat, Apr 15, 2013
    #2
    1. Advertising

  3. J. Gleixner

    C.DeRykus Guest

    On Monday, April 15, 2013 12:24:25 PM UTC-7, Rainer Weikusat wrote:
    > "J. Gleixner" <> writes:
    >
    > > On 04/15/13 13:17, David Harmon wrote:

    >
    >
    >
    > [...]
    >
    >
    >
    > >> while (<$fin>) {

    >
    > >> chomp;

    >
    > >> if (/Newsgroups: .*,/i) {

    >
    > >> $crosspost = 1;

    >
    > >> next;

    >
    > >> }

    >
    > >

    >
    > >> if (/^Content-Type:/i) {

    >
    > >> $hascont = 'y';

    >
    > >> next;

    >
    > >> }

    >
    > >

    >
    > > Even a C Hacker should have learned about using else...else if...

    >
    >
    >
    > This doesn't make much sense here because there is no common code
    >
    > after the sequence of if-statements. The 'next' communicates this
    >
    > clearly. Using an if - elsif - elsif - else cascade instead would
    >
    > server to obfuscate this fact instead.
    >
    >


    But similarly you could argue "next" is confusing
    because it suggests that you need to bypass common
    code at the end of the conditionals. Since there
    isn't any common code, "next" is unnecessary and
    adds line noise. Of course, if it's a huge sequence
    of conditionals, you could argue that the added
    clarity trumps line noise.

    > ...


    --
    Charles DeRykus
    C.DeRykus, Apr 15, 2013
    #3
  4. "C.DeRykus" <> writes:
    > On Monday, April 15, 2013 12:24:25 PM UTC-7, Rainer Weikusat wrote:
    >> "J. Gleixner" <> writes:
    >> > On 04/15/13 13:17, David Harmon wrote:


    [...]


    >> >> if (/Newsgroups: .*,/i) {
    >> >> $crosspost = 1;
    >> >> next;
    >> >> }


    [...]

    >> > Even a C Hacker should have learned about using else...else if...

    >>
    >> This doesn't make much sense here because there is no common code
    >> after the sequence of if-statements. The 'next' communicates this
    >> clearly. Using an if - elsif - elsif - else cascade instead would
    >> server to obfuscate this fact instead.

    >
    > But similarly you could argue "next" is confusing
    > because it suggests that you need to bypass common
    > code at the end of the conditionals.


    There is 'common code' after each if block, namely, all the remaining if
    conditions which have no special relation to each other and 'next' has
    no special relation to them.
    Rainer Weikusat, Apr 15, 2013
    #4
  5. J. Gleixner

    C.DeRykus Guest

    On Monday, April 15, 2013 1:59:30 PM UTC-7, Rainer Weikusat wrote:
    > "C.DeRykus" <> writes:
    >
    > > On Monday, April 15, 2013 12:24:25 PM UTC-7, Rainer Weikusat wrote:

    >
    > >> "J. Gleixner" <> writes:

    >
    > >> > On 04/15/13 13:17, David Harmon wrote:

    >
    >
    >
    > [...]
    >
    >
    >
    >
    >
    > >> >> if (/Newsgroups: .*,/i) {

    >
    > >> >> $crosspost = 1;

    >
    > >> >> next;

    >
    > >> >> }

    >
    >
    >
    > [...]
    >
    >
    >
    > >> > Even a C Hacker should have learned about using else...else if...

    >
    > >>

    >
    > >> This doesn't make much sense here because there is no common code

    >
    > >> after the sequence of if-statements. The 'next' communicates this

    >
    > >> clearly. Using an if - elsif - elsif - else cascade instead would

    >
    > >> server to obfuscate this fact instead.

    >
    > >

    >
    > > But similarly you could argue "next" is confusing

    >
    > > because it suggests that you need to bypass common

    >
    > > code at the end of the conditionals.

    >
    >
    >
    > There is 'common code' after each if block, namely, all the remaining if
    >
    > conditions which have no special relation to each other and 'next' has
    >
    > no special relation to them.



    But, there really didn't need to be any common code
    since each "if" clause was accompanied with a "next".

    So there's no compelling reason to not code this as
    "if-elsif..." and avoid the noisy "next" insertions.

    --
    Charles DeRykus
    C.DeRykus, Apr 15, 2013
    #5
  6. "C.DeRykus" <> writes:
    > On Monday, April 15, 2013 1:59:30 PM UTC-7, Rainer Weikusat wrote:


    [...]

    >> There is 'common code' after each if block, namely, all the remaining if
    >> conditions which have no special relation to each other and 'next' has
    >> no special relation to them.

    >
    > But, there really didn't need to be any common code
    > since each "if" clause was accompanied with a "next".
    >
    > So there's no compelling reason to not code this as
    > "if-elsif..." and avoid the noisy "next" insertions.


    The compelling reason is that execution continues with the next
    statement after the if-elsif... cascade in case of such a cascade. But
    when there is not such 'next statement', the whole construct is just
    a blind: It forces someone who is not familiar with the code to go
    looking for such common code only to find none.
    Rainer Weikusat, Apr 15, 2013
    #6
  7. J. Gleixner

    J. Gleixner Guest

    On 04/15/13 14:55, David Harmon wrote:
    > On Mon, 15 Apr 2013 14:04:32 -0500 in comp.lang.perl.misc, "J.
    > Gleixner"<> wrote,

    [...]
    >> Use elsif on the rest of these, since it can't start with more than one
    >> of these.. and avoids the need of 'next;' in every if.

    >
    > In my opinion 'next;' is better in this case because 'else if'
    > requires the reader to look below to see if anything else may happen
    > before the end of the loop.


    You asked for "idiomatic" suggestions. I rarely see code that follows
    your "opinion", but it's not code I'll have to fix, so do whatever
    you want.

    There are times to use if/next, however using it in every
    if block is not idiomatic.

    >
    >>> if (/^User-Agent:|^X-News-Software:|^X-Newsreader:/i) {

    >> /^(?:User-Agent|X-News-Software|X-Newsreader):/i
    >> Tiny bit shorter..

    >
    > Actually, it is *exactly* the same number of characters.
    > And requires remembering (?:


    Knowing (?:) is idiomatic to Perl.

    It was more for showing how you could avoid repetition... Three
    carats, three colons... If you need to add another one, it won't
    require those common characters, and it's clear that whatever it is
    you are looking for starts from the beginning.. If the match
    can be from the beginning, or at any place, then it might be
    coded differently, but it this case they all start from the
    beginning, so do it once.

    >
    > What with the added grouping, does it make the regex any more
    > efficient?


    No grouping.

    >
    >>> }

    >> close( $fin );

    >
    > Does that matter?


    Again, it's idiomatic.. For ever open, there should be a close.
    That's common in most languages. Or 'use autodie;'.

    >
    >>> foreach (sort keys %statstab) {

    >> next unless statstab{$_}> 1;

    >
    > Here you give me the opposite advice regarding 'next;' as you did
    > above.


    Not at all related.

    There are a lot of times to use 'next', and this is one of many.

    You'd probably see this from a person who is well versed in Perl
    and you'd see the if() with folks who aren't as comfortable, which
    again would be idiomatic.
    J. Gleixner, Apr 15, 2013
    #7
  8. J. Gleixner

    C.DeRykus Guest

    On Monday, April 15, 2013 2:24:17 PM UTC-7, Rainer Weikusat wrote:
    > "C.DeRykus" <> writes:
    >
    > > On Monday, April 15, 2013 1:59:30 PM UTC-7, Rainer Weikusat wrote:

    >
    >
    >
    > [...]
    >
    >
    >
    > >> There is 'common code' after each if block, namely, all the remaining if

    >
    > >> conditions which have no special relation to each other and 'next' has

    >
    > >> no special relation to them.

    >
    > >

    >
    > > But, there really didn't need to be any common code

    >
    > > since each "if" clause was accompanied with a "next".

    >
    > >

    >
    > > So there's no compelling reason to not code this as

    >
    > > "if-elsif..." and avoid the noisy "next" insertions.

    >
    >
    >
    > The compelling reason is that execution continues with the next
    >
    > statement after the if-elsif... cascade in case of such a cascade. But
    >
    > when there is not such 'next statement', the whole construct is just
    >
    > a blind: It forces someone who is not familiar with the code to go
    >
    > looking for such common code only to find none.


    And the multiple "if...next" code forces you to read
    through them all anyway to determine if there's common
    code as well. After all, at some point, an "if..." may
    not have a "next". You'll be schleppen through lots of
    noisy "next" statements to ensure that's the case.

    There's no "One True Way". Stylistically, I'd prefer
    the "if...elsif" as the more idiomatic way but TIMTODI
    is the "next" best option.

    --
    Charles DeRykus
    C.DeRykus, Apr 15, 2013
    #8
  9. "J. Gleixner" <> writes:
    > On 04/15/13 14:55, David Harmon wrote:
    >> On Mon, 15 Apr 2013 14:04:32 -0500 in comp.lang.perl.misc, "J.
    >> Gleixner"<> wrote,


    [...]

    >>>> }
    >>> close( $fin );

    >>
    >> Does that matter?

    >
    > Again, it's idiomatic..


    According to

    http://oald8.oxfordlearnersdictionaries.com/dictionary/idiomatic

    'idiomatic' means

    containing expressions that are natural to a native speaker of
    a language

    It is somewhat unclear what this is supposed to mean in the context of
    a programming language but it certainly doesn't mean "include useless
    code because 'everyone else does it'". Perl supports proper stack
    unwinding and because of this, explicitly closing filehandles stored
    in lexical variables is never necessary. The perl runtime will also
    close all filehandles before the script exists.
    Rainer Weikusat, Apr 16, 2013
    #9
  10. "C.DeRykus" <> writes:
    > On Monday, April 15, 2013 2:24:17 PM UTC-7, Rainer Weikusat wrote:
    >> "C.DeRykus" <> writes:


    [...]

    >> The compelling reason is that execution continues with the next
    >> statement after the if-elsif... cascade in case of such a cascade. But
    >> when there is not such 'next statement', the whole construct is just
    >> a blind: It forces someone who is not familiar with the code to go
    >> looking for such common code only to find none.

    >
    > And the multiple "if...next" code forces you to read
    > through them all anyway to determine if there's common
    > code as well. After all, at some point, an "if..." may
    > not have a "next". You'll be schleppen through lots of
    > noisy "next" statements to ensure that's the case.


    I've meanwhile managed to understand what you were writing about:
    Considering the way the posted code was written, the only 'thing' the
    next statements actually skipped where the subsequent if tests and the
    only reason why 'subsequent if tests' exist at all on this code path
    is because if - elsif ... else wasn't used.

    But that's IMHO besides the point: The terminating next marks each
    if-block as 'self-contained' similar to the way the 'break' statement
    works in a switch(..) { } in C: For this case, no further code within
    the current block will be executed. And this is different from if
    ..... which denotes a set of alternate 'sibling sub-blocks' with the
    current block. One could visualize this as something like

    ------
    | stmt |
    ------

    ------
    | stmt |
    ------

    ------- ------- -------
    | if | | elsif | | else |
    ------- ------- -------

    ------
    | stmt |
    ------

    There's a 'point' after the if .. -cascade where all the alternate
    codepaths end up and if there is nothing at this point, the if ... is
    misleading.

    > There's no "One True Way".


    It is possible to argue about objective properties of the various
    alternatives and to form a reasoned opinion based on that.
    Rainer Weikusat, Apr 16, 2013
    #10
  11. In article <516c4f40$0$73617$>,
    J. Gleixner <> wrote:
    > On 04/15/13 13:17, David Harmon wrote:
    > > Here is some code I wrote in another newsgroup. It is probably
    > > obvious that it was written by an old C hacker. If anybody would
    > > like to suggest how it could be better perl or more idiomatic,
    > > go ahead.
    > >
    > >>>>

    > >
    > > use strict; use warnings;
    > > open my $fin, 't1.txt';

    >
    > Use 3-argument open handle failure (die) if it can't open.
    > perldoc -f open
    >
    > > my $fromline = '';
    > > my $useragent = '';
    > > my $hascont = 'n';
    > > my $crosspost = 0;
    > > my %statstab;

    >
    >
    > >
    > > while (<$fin>) {
    > > chomp;
    > > if (/Newsgroups: .*,/i) {
    > > $crosspost = 1;
    > > next;
    > > }

    >
    > > if (/^Content-Type:/i) {
    > > $hascont = 'y';
    > > next;
    > > }

    >
    > Even a C Hacker should have learned about using else...else if...


    When appropriate :)

    > Use elsif on the rest of these, since it can't start with more than one
    > of these.. and avoids the need of 'next;' in every if. If none of these
    > could also have 'Newsgroups: ' in it, then all of these, except for the
    > first if, should be elsif's.


    I disagree. To me, the use of "next;" or, in C, "continue;", is an explicit
    and succinct way of saying "At this point, I'm done with this iteration".
    It unambiguously tells the reader of the code that for this particular
    case, there is nothing happening further down within the loop.

    I find that MUCH clearer than having to read down through a long chain
    of if/elseif statements.

    Similarly for break and early return.

    Cheers
    Tony
    --
    Tony Mountifield
    Work: - http://www.softins.co.uk
    Play: - http://tony.mountifield.org
    Tony Mountifield, Apr 16, 2013
    #11
  12. musings about 'idiomatic' (was: More idiomatic)

    Rainer Weikusat <> writes:
    > "J. Gleixner" <> writes:
    >> On 04/15/13 14:55, David Harmon wrote:
    >>> On Mon, 15 Apr 2013 14:04:32 -0500 in comp.lang.perl.misc, "J.
    >>> Gleixner"<> wrote,

    >
    > [...]
    >
    >>>>> }
    >>>> close( $fin );
    >>>
    >>> Does that matter?

    >>
    >> Again, it's idiomatic..

    >
    > According to
    >
    > http://oald8.oxfordlearnersdictionaries.com/dictionary/idiomatic
    >
    > 'idiomatic' means
    >
    > containing expressions that are natural to a native speaker of
    > a language
    >
    > It is somewhat unclear what this is supposed to mean in the context of
    > a programming language


    The definition above could be paraphrased as 'using the language to
    express a circumstance in a specific way which is familiar to people
    with the same cultural background regarding language use as the
    speaker but which can't be inferred from the rules of the language
    itself'. The term also has a positive connotation. For instance "it
    fell through the cracks" would be an idiom in English but - despite
    the fact that listening to some 'native speakers' can leave one with
    the impression that English contains only one articulated word,
    namely, '****', and a small set of auxiliary grunts supposed to mark
    the topic of the conversation relative to '****', provided it is
    differentiated enough that such an intellectual endeavour makes any
    sense - the conventional barrage of fucks and grunts one can easily
    encounter in an English pub wouldn't be considered 'idiomatic use of
    English'. In line with this observation, the so-called 'Schwartzian
    transform' could be considered 'a Perl idiom', as could the be

    use warnings;
    use strict;

    at the beginnig of every file but neither the mechanic insertion of
    useless statements (like close($fin)) nor shunning all control flow
    controlling constructs in favor of if - elsif - else ... would
    qualify. The former suggest that the author's main cultural background
    is actually something other than Perl, eg, Java, where the runtime
    can't manage anything except memory automatically and the latter is
    similar to the colloquial use of '****' as universal adjective and
    near-universal verb in some form of 'good enough for me'
    English. There's even a term for 'restricted language subsets' of this
    kind, although it is not usually used for native speakers, namely,
    'pidgin English' or 'pidgin Perl' for the if ... case. Especially
    considering that this construct isn't specific to Perl at all but of
    similar universality in 'pidgin C' and - presumably - almost all other
    languages with support for structured programming.
    Rainer Weikusat, Apr 16, 2013
    #12
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Zed A. Shaw
    Replies:
    2
    Views:
    106
    Zed A. Shaw
    Apr 23, 2005
  2. Charles Calvert
    Replies:
    33
    Views:
    348
    w_a_x_man
    Oct 22, 2010
  3. Alan Mead
    Replies:
    4
    Views:
    137
    Alan Mead
    Feb 12, 2005
  4. Rainer Weikusat

    Re: More idiomatic

    Rainer Weikusat, Apr 15, 2013, in forum: Perl Misc
    Replies:
    1
    Views:
    150
    Rainer Weikusat
    Apr 15, 2013
  5. Tim McDaniel

    Re: More idiomatic

    Tim McDaniel, Apr 15, 2013, in forum: Perl Misc
    Replies:
    3
    Views:
    160
    Keith Thompson
    Apr 16, 2013
Loading...

Share This Page