Regular Expression Question on /i

Discussion in 'Perl Misc' started by amyl@paxemail.com, Jan 29, 2005.

  1. Guest

    I was playing around with some regular expressions in Perl and I
    noticed some of them have the following character in them "/i".

    For example:
    \b(?:college|university)\s+diplomas/i

    I can't seem to find any info on what /i is. Can anyone shed some
    light on /i?

    Thanks
    Amy.
    , Jan 29, 2005
    #1

  2. ddog Guest

    On 28 Jan 2005, the Good made the following observation
    in comp.lang.perl.misc:

    > I was playing around with some regular expressions in Perl and I
    > noticed some of them have the following character in them "/i".
    >
    > For example:
    > \b(?:college|university)\s+diplomas/i
    >
    > I can't seem to find any info on what /i is. Can anyone shed some
    > light on /i?
    >
    > Thanks
    > Amy.
    >
    >


    Case-insensitive operator. /Dummytext/ matches exactly "Dummytext" but
    /Dummytext/i matches "DUMMYTEXT, DuMmYTEXT,..." and so on.
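
    A minimal sketch of that difference, assuming a few made-up sample
    strings (only the /Dummytext/ pattern comes from the post above):

```perl
use strict;
use warnings;

# Hypothetical sample strings for illustration.
my @samples = ('Dummytext', 'DUMMYTEXT', 'DuMmYTEXT');

# Without /i, only the exact-case spelling matches.
my @exact = grep { /Dummytext/ } @samples;

# With /i, every spelling matches regardless of case.
my @any_case = grep { /Dummytext/i } @samples;

print "exact:    @exact\n";
print "any case: @any_case\n";
```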

    --
    ddog, Jan 29, 2005
    #2

  3. wrote:
    > I was playing around with some regular expressions in Perl and I
    > noticed some of them have the following character in them "/i".
    >
    > For example:
    > \b(?:college|university)\s+diplomas/i
    >
    > I can't seem to find any info on what /i is. Can anyone shed some
    > light on /i?


    perldoc perlre:
    i Do case-insensitive pattern matching.

    If "use locale" is in effect, the case map is taken from the current
    locale. See the perllocale manpage.

    jue
    Jürgen Exner, Jan 29, 2005
    #3
  4. <> wrote:

    > I was playing around with some regular expressions in Perl and I
    > noticed some of them have the following character in them "/i".

    ^^^^^^^
    No they don't.

    (shouldn't that be "following characters" since there is more
    than one of them?)


    > For example:
    > \b(?:college|university)\s+diplomas/i



    The "i" is not in a regular expression there, the "i" is part
    of an *operator* that makes use of regular expressions.

    You show the ending delimiter but not the corresponding opening one:

    /\b(?:college|university)\s+diplomas/i
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    The underlined part, between the slashes, is a regular expression.

    The match operator takes a regular expression as one of its operands.
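
    As a rough illustration of the modifier trailing the operator's closing
    delimiter, whatever that delimiter is, here is a sketch using the
    pattern from the question. The input string and the variable names are
    assumptions made up for this example:

```perl
use strict;
use warnings;

# Hypothetical input; note the capital letters and the double space.
my $text = 'Get your University  Diplomas here';

# The match operator spelled out in full: m, delimiters, pattern, modifier.
my $with_slashes = $text =~ m/\b(?:college|university)\s+diplomas/i ? 1 : 0;

# The same operator with brace delimiters; the "i" still follows the
# closing delimiter.
my $with_braces = $text =~ m{\b(?:college|university)\s+diplomas}i ? 1 : 0;

# Without the modifier the capital letters prevent a match.
my $case_sensitive = $text =~ m/\b(?:college|university)\s+diplomas/ ? 1 : 0;

print "slashes=$with_slashes braces=$with_braces plain=$case_sensitive\n";
```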


    > I can't seem to find any info on what /i is.



    Where did you look?

    (maybe you are missing out on some helpful references...)


    > Can anyone shed some
    > light on /i?



    Now that we know that we have a question about an operator
    rather than about a regular expression, we look it up where
    we look up all of Perl's operators:

    perldoc perlop

    ...
    i Do case-insensitive pattern matching.
    ...


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Jan 29, 2005
    #4
  5. Guest

    Thank you so much. I had no idea about "perldoc".

    Amy
    , Jan 29, 2005
    #5
  6. [A complimentary Cc of this posting was sent to
    Tad McClellan
    <>], who wrote in article <>:
    > > \b(?:college|university)\s+diplomas/i

    >
    >
    > The "i" is not in a regular expression there, the "i" is part
    > of an *operator* that makes use of regular expressions.


    Actually, it is in a regular expression; so are 'x', 's' and 'm'.

    Hope this helps,
    Ilya
    Ilya Zakharevich, Jan 29, 2005
    #6
  7. <> wrote:

    > Thank you so much.



    Thank _who_ so much?

    It is customary to quote some context when composing a followup.

    Have you seen the Posting Guidelines that are posted here frequently?


    > I had no idea about "perldoc".



    Have you seen the Posting Guidelines that are posted here frequently?


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Jan 29, 2005
    #7
  8. Anno Siegel Guest

    Ilya Zakharevich <> wrote in comp.lang.perl.misc:
    > [A complimentary Cc of this posting was sent to
    > Tad McClellan
    > <>], who wrote in article
    > <>:
    > > > \b(?:college|university)\s+diplomas/i

    > >
    > >
    > > The "i" is not in a regular expression there, the "i" is part
    > > of an *operator* that makes use of regular expressions.

    >
    > Actually, it is in a regular expression; so are 'x', 's' and 'm'.


    Hmmm? I'm not following.

    You must mean (?imsx), which *can* indeed be embedded in a regular
    expression. If they are given in the form m/.../imsx however, they
    aren't (even if the regex compiler doesn't make a difference).
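
    For comparison, a small sketch of the two spellings side by side. The
    test string and variable names are assumptions for illustration only:

```perl
use strict;
use warnings;

my $text = 'College Diplomas';   # hypothetical input

# Modifier given after the closing delimiter of the operator.
my $via_operator = $text =~ /college\s+diplomas/i ? 1 : 0;

# Modifier embedded in the pattern text itself.
my $via_inline = $text =~ /(?i)college\s+diplomas/ ? 1 : 0;

# The embedded form can be scoped to a group: here only "college" is
# case-insensitive, so the lowercase "diplomas" fails against "Diplomas".
my $scoped_miss = $text =~ /(?i:college)\s+diplomas/ ? 1 : 0;
my $scoped_hit  = $text =~ /(?i:college)\s+Diplomas/ ? 1 : 0;

print "operator=$via_operator inline=$via_inline ",
      "scoped_miss=$scoped_miss scoped_hit=$scoped_hit\n";
```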

    Or do you mean something else?

    Anno
    Anno Siegel, Jan 30, 2005
    #8
  9. [A complimentary Cc of this posting was sent to
    Anno Siegel
    <-berlin.de>], who wrote in article <ctjdj1$h4n$-Berlin.DE>:
    > > > The "i" is not in a regular expression there, the "i" is part
    > > > of an *operator* that makes use of regular expressions.

    > >
    > > Actually, it is in a regular expression; so are 'x', 's' and 'm'.


    > You must mean (?imsx), which *can* indeed be embedded in a regular
    > expression.


    I did not.

    > If they are given in the form m/.../imsx however, they
    > aren't.


    Yes, they are. ;-)

    It is unfortunate, that some of modifiers are actually parts of a REx,
    while the others are part of the OP which uses the REx, but this is a
    legacy design...

    Hope this helps,
    Ilya
    Ilya Zakharevich, Jan 30, 2005
    #9
  10. Anno Siegel Guest

    Ilya Zakharevich <> wrote in comp.lang.perl.misc:
    > [A complimentary Cc of this posting was sent to
    > Anno Siegel
    > <-berlin.de>], who wrote in article
    > <ctjdj1$h4n$-Berlin.DE>:
    > > > > The "i" is not in a regular expression there, the "i" is part
    > > > > of an *operator* that makes use of regular expressions.
    > > >
    > > > Actually, it is in a regular expression; so are 'x', 's' and 'm'.

    >
    > > You must mean (?imsx), which *can* indeed be embedded in a regular
    > > expression.

    >
    > I did not.
    >
    > > If they are given in the form m/.../imsx however, they
    > > aren't.

    >
    > Yes, they are. ;-)
    >
    > It is unfortunate, that some of modifiers are actually parts of a REx,
    > while the others are part of the OP which uses the REx, but this is a
    > legacy design...


    Please, in what *sense* are they part of the regex? Textually they aren't.

    Anno
    Anno Siegel, Jan 30, 2005
    #10
  11. Anno Siegel Guest

    Ilya Zakharevich <> wrote in comp.lang.perl.misc:
    > [A complimentary Cc of this posting was sent to
    > Anno Siegel
    > <-berlin.de>], who wrote in article
    > <ctjdj1$h4n$-Berlin.DE>:
    > > > > The "i" is not in a regular expression there, the "i" is part
    > > > > of an *operator* that makes use of regular expressions.
    > > >
    > > > Actually, it is in a regular expression; so are 'x', 's' and 'm'.

    >
    > > You must mean (?imsx), which *can* indeed be embedded in a regular
    > > expression.

    >
    > I did not.
    >
    > > If they are given in the form m/.../imsx however, they
    > > aren't.

    >
    > Yes, they are. ;-)
    >
    > It is unfortunate, that some of modifiers are actually parts of a REx,
    > while the others are part of the OP which uses the REx, but this is a
    > legacy design...


    I know better than to accuse you of making no sense, but please, in what
    *sense* are they part of the regex? Textually they aren't. That is what
    the user sees.

    Anno
    Anno Siegel, Jan 30, 2005
    #11
  12. [A complimentary Cc of this posting was sent to
    Anno Siegel
    <-berlin.de>], who wrote in article <ctjj64$l7p$-Berlin.DE>:
    > > It is unfortunate, that some of modifiers are actually parts of a REx,
    > > while the others are part of the OP which uses the REx, but this is a
    > > legacy design...


    > I know better than to accuse you of making no sense, but please, in what
    > *sense* are they part of the regex? Textually they aren't. That is what
    > the user sees.


    Assume that separation into "the REx" and "the operator which uses the
    REx" makes some sense (one could say: it is just a script, and I do not
    care what happens inside this script; it would be hard to discuss this
    topic with such a person, right? And in some situations this point of
    view is very productive...).

    If we draw the separation line according to implementation, then the
    modifiers 's', 'm', 'i', 'x' modify the match REx, while other
    modifiers modify the operator which will do the match. Of course,
    infinitely many other ways to pass this line may be chosen; however,
    one I choose is the only one which has some pseudo-objective ground. ;-)
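
    One way to observe this from the outside, sketched with qr// (which the
    post does not mention, so treat it as an assumption about what best
    illustrates the point): the case-insensitivity travels with the compiled
    pattern and still applies when the matching operator carries no
    modifier of its own.

```perl
use strict;
use warnings;

# Compile a pattern with the "i" modifier.
my $re = qr/diplomas/i;

# Match with an operator that has no modifiers: still case-insensitive.
my $bare_hit = 'DIPLOMAS' =~ /$re/ ? 1 : 0;

# Embed the compiled pattern in a larger, case-sensitive pattern: the "i"
# applies only to the embedded part.
my $embedded_hit  = 'x DIPLOMAS' =~ /^x\s+$re$/ ? 1 : 0;
my $embedded_miss = 'X DIPLOMAS' =~ /^x\s+$re$/ ? 1 : 0;

# The stringified form records the flag; the exact text varies between
# Perl versions, so it is only printed here.
my $as_text = "$re";

print "bare=$bare_hit embedded=$embedded_hit miss=$embedded_miss text=$as_text\n";
```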

    If you think so straight as you pretend ;-), try to spell out where
    this 'm' belongs to in this example:

    print for split m(foo), qw(bar), -1;

    Yours,
    Ilya
    Ilya Zakharevich, Jan 31, 2005
    #12
  13. Anno Siegel Guest

    Ilya Zakharevich <> wrote in comp.lang.perl.misc:
    > [A complimentary Cc of this posting was sent to
    > Anno Siegel
    > <-berlin.de>], who wrote in article
    > <ctjj64$l7p$-Berlin.DE>:
    > > > It is unfortunate, that some of modifiers are actually parts of a REx,
    > > > while the others are part of the OP which uses the REx, but this is a
    > > > legacy design...

    >
    > > I know better than to accuse you of making no sense, but please, in what
    > > *sense* are they part of the regex? Textually they aren't. That is what
    > > the user sees.

    >
    > Assume that separation into "the REx" and "the operator which uses the
    > REx" makes some sense (one could say: it is just a script, and I do not
    > care what happens inside this script; it would be hard to discuss this
    > topic with such a person, right? And in some situations this point of
    > view is very productive...).


    Insightful, yes. Productivity depends on what you want to produce. In
    the context of clpm, I'm interested in a terminology that is useful in
    discussing Perl programs. As such, its distinctions should be as obvious
    as possible. It is useful to distinguish the regex proper from the
    operators that use it (m//, s///, qr//). At the surface of things,
    the modifiers are clearly part of the operators, not only textually, but
    also because the operator determines which modifiers are possible.

    I think, for everyday discourse it is wisest to go with that concept.
    A terminology that carefully distinguishes between modifiers that are
    part of the regex and others that truly belong to the operator may
    carry deeper insight, but it is impractical.

    > If we draw the separation line according to implementation, then the
    > modifiers 's', 'm', 'i', 'x' modify the match REx, while other


    /g also modifies the match in scalar context.

    > modifiers modify the operator which will do the match. Of course,
    > infinitely many other ways to pass this line may be chosen; however,
    > one I choose is the only one which has some pseudo-objective ground. ;-)


    I'll rather stick to the hard facts of syntax, even if they lie about
    the deeper reality of implementation. Taken to an extreme, we couldn't
    discuss a program without first de-parsing it. That is not a direction
    I want to go.

    > If you think so straight as you pretend ;-), try to spell out where
    > this 'm' belongs to in this example:
    >
    > print for split m(foo), qw(bar), -1;


    Now you've lost me again. What has the role of 'm' in the split expression
    to do with the distinctions among the /gmisox... modifiers?

    Of course there is something strange about the first parameter of split(),
    it is not evaluated like a function parameter would (and split() doesn't
    have a prototype). As far as I'm concerned, that's a peculiarity of the
    syntax introduced by the keyword "split" and has little to do with regular
    expressions and their operators.
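
    A short sketch of that peculiarity, under the assumption that the point
    is that split() uses its first argument as a pattern and does not run
    it as a match against $_ first. The input strings are made up:

```perl
use strict;
use warnings;

# If m(foo) were evaluated as an ordinary match against $_, it would
# succeed here and yield 1, and split would then divide on "1".
local $_ = 'foo';

# Instead, split uses m(foo) as the separator pattern.
my @fields = split m(foo), 'a1bfooc';

# A negative limit keeps trailing empty fields.
my @kept    = split m(foo), 'afoobfoo', -1;
my @dropped = split m(foo), 'afoobfoo';

print join('|', @fields), "\n";
print scalar(@kept), " fields with -1, ", scalar(@dropped), " without\n";
```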

    Anno
    Anno Siegel, Jan 31, 2005
    #13
  14. [A complimentary Cc of this posting was sent to
    Anno Siegel
    <-berlin.de>], who wrote in article <ctm49k$a63$-Berlin.DE>:
    > as possible. It is useful to distinguish the regex proper from the
    > operators that use it (m//, s///, qr//). At the surface of things,
    > the modifiers are clearly part of the operators, not only textually, but
    > also because the operator determines which modifiers are possible.


    Aha, you see: the stuff which "is in REx" is always possible, since it
    has no relation to operator. ;-)

    > I think, for everyday discourse it is wisest to go with that concept.
    > A terminology that carefully distinguishes between modifiers that are
    > part of the regex and others that truly belong to the operator may
    > carry deeper insight, but it is impractical.


    I do not think one can put practical/impractical under one hat. There
    may be many situations where I would not mention that these guys are
    parts of a REx (TEACHING is CHEATING). However, IMO c.l.p.m is mature
    enough to eat this and profit from it.

    > > If we draw the separation line according to implementation, then the
    > > modifiers 's', 'm', 'i', 'x' modify the match REx, while other


    > /g also modifies the match in scalar context.


    AFAIK, /g does not modify the REx. All it does is "try to match this
    REx at larger offsets if it succeeds".
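
    A sketch of that behaviour in scalar context, assuming a made-up input
    string: the compiled pattern is the same object throughout, and /g only
    changes where each successive attempt starts, as reported by pos().

```perl
use strict;
use warnings;

my $string = 'aXbXc';   # hypothetical input
my $re     = qr/X/;     # the pattern itself carries no /g

# Each successful scalar-context /g match resumes after the previous one;
# pos() reports the offset just past the most recent match.
my @offsets;
while ($string =~ /$re/g) {
    push @offsets, pos($string);
}

# Without /g the same pattern always starts from the beginning.
my $first_only = $string =~ /$re/ ? $-[0] : -1;

print "offsets after each match: @offsets\n";
print "start of non-/g match: $first_only\n";
```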

    > I'll rather stick to the hard facts of syntax, even if they lie about
    > the deeper reality of implementation. Taken to an extreme, we couldn't
    > discuss a program without first de-parsing it. That is not a direction
    > I want to go.


    Well, if you do not like Perl, you know where you need to go. I tried
    to decrease this problem with CPerl, but not everybody uses it.
    Unless you know where is closing char of a REx, you cannot seriously
    discuss its work...

    > > If you think so straight as you pretend ;-), try to spell out where
    > > this 'm' belongs to in this example:
    > >
    > > print for split m(foo), qw(bar), -1;

    >
    > Now you've lost me again. What has the role of 'm' in the split expression
    > to do with the distinctions among the /gmisox... modifiers?


    You want a simple non-ambiguous answer in one situation. I just show
    a similar situation with similar lack of an obvious answer.

    I trust that we understand each other's POVs enough, right? Is it
    productive to continue this discussion? If not, maybe let it rest on
    Google? ;-)

    Yours,
    Ilya
    Ilya Zakharevich, Jan 31, 2005
    #14
  15. Anno Siegel Guest

    Ilya Zakharevich <> wrote in comp.lang.perl.misc:

    > I trust that we understand each other's POVs enough, right? Is it
    > productive to continue this discussion? If not, maybe let it rest on
    > Google? ;-)


    Very well.

    Anno
    Anno Siegel, Jan 31, 2005
    #15
