Pattern Search Regular Expression

Discussion in 'Python' started by subhabangalore@gmail.com, Jun 15, 2013.

  1. Guest

    Dear Group,

    I am trying to search the following pattern in Python.

    I have following strings:

    (i)"In the ocean"
    (ii)"On the ocean"
    (iii) "By the ocean"
    (iv) "In this group"
    (v) "In this group"
    (vi) "By the new group"
    .....

    I want to extract from the first word to the last word,
    where first word and last word are varying.

    I am looking to extract out:
    (i) the
    (ii) the
    (iii) the
    (iv) this
    (v) this
    (vi) the new
    .....

    The problem may be handled by converting the string to list and then
    index of list.

    But I am thinking if I can use regular expression in Python.

    If any one of the esteemed members can help.

    Thanking you in Advance,

    Regards,
    Subhabrata
     
    , Jun 15, 2013
    #1

  2. On Sat, 15 Jun 2013 02:42:55 -0700, subhabangalore wrote:

    > Dear Group,
    >
    > I am trying to search the following pattern in Python.
    >
    > I have following strings:
    >
    > (i)"In the ocean"
    > (ii)"On the ocean"
    > (iii) "By the ocean"
    > (iv) "In this group"
    > (v) "In this group"
    > (vi) "By the new group"
    > .....
    >
    > I want to extract from the first word to the last word, where first word
    > and last word are varying.
    >
    > I am looking to extract out:
    > (i) the
    > (ii) the
    > (iii) the
    > (iv) this
    > (v) this
    > (vi) the new
    > .....
    >
    > The problem may be handled by converting the string to list and then
    > index of list.


    No need for a regular expression.


    py> sentence = "By the new group"
    py> words = sentence.split()
    py> words[1:-1]
    ['the', 'new']

    Does that help?



    --
    Steven
     
    Steven D'Aprano, Jun 15, 2013
    #2

  3. On 15/06/2013 10:42, wrote:
    > Dear Group,
    >
    > I am trying to search the following pattern in Python.
    >
    > I have following strings:
    >
    > (i)"In the ocean"
    > (ii)"On the ocean"
    > (iii) "By the ocean"
    > (iv) "In this group"
    > (v) "In this group"
    > (vi) "By the new group"
    > .....
    >
    > I want to extract from the first word to the last word,
    > where first word and last word are varying.
    >
    > I am looking to extract out:
    > (i) the
    > (ii) the
    > (iii) the
    > (iv) this
    > (v) this
    > (vi) the new
    > .....
    >
    > The problem may be handled by converting the string to list and then
    > index of list.
    >
    > But I am thinking if I can use regular expression in Python.
    >
    > If any one of the esteemed members can help.
    >
    > Thanking you in Advance,
    >
    > Regards,
    > Subhabrata
    >


    I tend to reach for string methods rather than an RE so will something
    like this suit you?

    c:\Users\Mark\MyPython>type a.py
    for s in ("In the ocean",
              "On the ocean",
              "By the ocean",
              "In this group",
              "In this group",
              "By the new group"):
        print(' '.join(s.split()[1:-1]))


    c:\Users\Mark\MyPython>a
    the
    the
    the
    this
    this
    the new
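
The same split-and-slice idea can be wrapped in a small function, which makes the edge cases easier to see. This is only a sketch; the name `middle_words` is made up for illustration and does not appear in the thread.

```python
def middle_words(sentence):
    """Return the words between the first and last word, joined by single spaces."""
    # split() with no arguments splits on any run of whitespace,
    # so repeated or leading/trailing spaces are harmless.
    return ' '.join(sentence.split()[1:-1])


print(middle_words("By the new group"))
print(repr(middle_words("In ocean")))  # nothing between the first and last word
```

A sentence with fewer than three words simply gives an empty string rather than raising an error, because slicing never fails on a short list.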

    --
    "Steve is going for the pink ball - and for those of you who are
    watching in black and white, the pink is next to the green." Snooker
    commentator 'Whispering' Ted Lowe.

    Mark Lawrence
     
    Mark Lawrence, Jun 15, 2013
    #3
  4. On Sat, 15 Jun 2013 10:05:01 +0000, Steven D'Aprano wrote:

    > On Sat, 15 Jun 2013 02:42:55 -0700, subhabangalore wrote:
    >
    >> Dear Group,
    >>
    >> I am trying to search the following pattern in Python.
    >>
    >> I have following strings:
    >>
    >> (i)"In the ocean" (ii)"On the ocean" (iii) "By the ocean" (iv) "In
    >> this group" (v) "In this group" (vi) "By the new group"
    >> .....
    >>
    >> I want to extract from the first word to the last word, where first
    >> word and last word are varying.
    >>
    >> I am looking to extract out:
    >> (i) the (ii) the (iii) the (iv) this (v) this (vi) the new
    >> .....
    >>
    >> The problem may be handled by converting the string to list and then
    >> index of list.

    >
    > No need for a regular expression.
    >
    > py> sentence = "By the new group"
    > py> words = sentence.split()
    > py> words[1:-1]
    > ['the', 'new']
    >
    > Does that help?


    I thought OP wanted:

    words[words[0],words[-1]]

    But that might be just my caffeine deprived misinterpretation of his
    terminology.

    --
    Denis McMahon,
     
    Denis McMahon, Jun 15, 2013
    #4
  5. On 15/06/2013 11:24, Denis McMahon wrote:
    > On Sat, 15 Jun 2013 10:05:01 +0000, Steven D'Aprano wrote:
    >
    >> On Sat, 15 Jun 2013 02:42:55 -0700, subhabangalore wrote:
    >>
    >>> Dear Group,
    >>>
    >>> I am trying to search the following pattern in Python.
    >>>
    >>> I have following strings:
    >>>
    >>> (i)"In the ocean" (ii)"On the ocean" (iii) "By the ocean" (iv) "In
    >>> this group" (v) "In this group" (vi) "By the new group"
    >>> .....
    >>>
    >>> I want to extract from the first word to the last word, where first
    >>> word and last word are varying.
    >>>
    >>> I am looking to extract out:
    >>> (i) the (ii) the (iii) the (iv) this (v) this (vi) the new
    >>> .....
    >>>
    >>> The problem may be handled by converting the string to list and then
    >>> index of list.

    >>
    >> No need for a regular expression.
    >>
    >> py> sentence = "By the new group"
    >> py> words = sentence.split()
    >> py> words[1:-1]
    >> ['the', 'new']
    >>
    >> Does that help?

    >
    > I thought OP wanted:
    >
    > words[words[0],words[-1]]
    >
    > But that might be just my caffeine deprived misinterpretation of his
    > terminology.
    >


    >>> sentence = "By the new group"
    >>> words = sentence.split()
    >>> words[words[0],words[-1]]

    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: list indices must be integers, not tuple

    So why would the OP want a TypeError? Or has caffeine deprivation
    affected your typing skills? :)

    --
    "Steve is going for the pink ball - and for those of you who are
    watching in black and white, the pink is next to the green." Snooker
    commentator 'Whispering' Ted Lowe.

    Mark Lawrence
     
    Mark Lawrence, Jun 15, 2013
    #5
  6. rusi Guest

    On Jun 15, 3:55 pm, Mark Lawrence <> wrote:
    > On 15/06/2013 11:24, Denis McMahon wrote:
    > > On Sat, 15 Jun 2013 10:05:01 +0000, Steven D'Aprano wrote:
    > >> On Sat, 15 Jun 2013 02:42:55 -0700, subhabangalore wrote:
    > >>> Dear Group,
    > >>>
    > >>> I am trying to search the following pattern in Python.
    > >>>
    > >>> I have following strings:
    > >>>
    > >>> (i)"In the ocean" (ii)"On the ocean" (iii) "By the ocean" (iv) "In
    > >>> this group" (v) "In this group" (vi) "By the new group"
    > >>> .....
    > >>>
    > >>> I want to extract from the first word to the last word, where first
    > >>> word and last word are varying.
    > >>>
    > >>> I am looking to extract out:
    > >>> (i) the (ii) the (iii) the (iv) this (v) this (vi) the new
    > >>> .....
    > >>>
    > >>> The problem may be handled by converting the string to list and then
    > >>> index of list.
    > >>
    > >> No need for a regular expression.
    > >>
    > >> py> sentence = "By the new group"
    > >> py> words = sentence.split()
    > >> py> words[1:-1]
    > >> ['the', 'new']
    > >>
    > >> Does that help?
    > >
    > > I thought OP wanted:
    > >
    > > words[words[0],words[-1]]
    > >
    > > But that might be just my caffeine deprived misinterpretation of his
    > > terminology.
    >
    >  >>> sentence = "By the new group"
    >  >>> words = sentence.split()
    >  >>> words[words[0],words[-1]]
    > Traceback (most recent call last):
    >    File "<stdin>", line 1, in <module>
    > TypeError: list indices must be integers, not tuple
    >
    > So why would the OP want a TypeError?  Or has caffeine deprivation
    > affected your typing skills? :)


    :)

    I guess Denis meant (words[0], words[-1])

    To the OP:
    You have the identity:
    words == [words[0]] + words[1:-1] + [words[-1]]

    So take your pick of what parts of the expression you want (and
    discard what you don't want).
    [The way you've used 'extract' is a bit ambiguous]
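
A quick way to convince yourself of the identity above is to test it on a few inputs. This is a sketch only; `check_identity` is a hypothetical helper name, not something from the thread.

```python
def check_identity(sentence):
    """Check words == [first] + middle + [last] for the words of a sentence."""
    words = sentence.split()
    return words == [words[0]] + words[1:-1] + [words[-1]]


print(check_identity("By the new group"))  # four words
print(check_identity("In ocean"))          # two words, empty middle
print(check_identity("ocean"))             # one word: first and last are the same element
```

The identity assumes at least two words. With a single word the first and last pieces are the same element, so the right-hand side has it twice, and with an empty string `words[0]` raises `IndexError`.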
     
    rusi, Jun 15, 2013
    #6
  7. On Sat, 15 Jun 2013 11:55:34 +0100, Mark Lawrence wrote:

    > >>> sentence = "By the new group"
    > >>> words = sentence.split()
    > >>> words[words[0],words[-1]]

    > Traceback (most recent call last):
    > File "<stdin>", line 1, in <module>
    > TypeError: list indices must be integers, not tuple
    >
    > So why would the OP want a TypeError? Or has caffeine deprivation
    > affected your typing skills? :)


    Yeah - that last:

    words[words[0],words[-1]]

    should probably have been:

    first_and_last = [words[0], words[-1]]

    or even:

    first_and_last = (words[0], words[-1])

    Or even:

    first_and_last = [sentence.split() for i in (0, -1)]
    middle = sentence.split()[1:-2]

    --
    Denis McMahon,
     
    Denis McMahon, Jun 15, 2013
    #7
  8. On Sat, 15 Jun 2013 13:41:21 +0000, Denis McMahon wrote:

    > first_and_last = [sentence.split() for i in (0, -1)] middle =
    > sentence.split()[1:-2]


    Bugger! That last is actually:

    sentence.split()[1:-1]

    It just looks like a two.
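
For reference, the comprehension quoted above has a second slip besides the `-2`: as posted, `[sentence.split() for i in (0, -1)]` never uses `i`, so it produces two copies of the whole word list. A tested sketch of what was presumably intended (that intent is an assumption):

```python
sentence = "By the new group"
words = sentence.split()                       # split once and reuse the list

first_and_last = [words[i] for i in (0, -1)]   # index with i to pick the first and last word
middle = words[1:-1]                           # everything between them

print(first_and_last)
print(middle)
```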

    --
    Denis McMahon,
     
    Denis McMahon, Jun 15, 2013
    #8
  9. On 15/06/2013 14:45, Denis McMahon wrote:
    > On Sat, 15 Jun 2013 13:41:21 +0000, Denis McMahon wrote:
    >
    >> first_and_last = [sentence.split() for i in (0, -1)] middle =
    >> sentence.split()[1:-2]

    >
    > Bugger! That last is actually:
    >
    > sentence.split()[1:-1]
    >
    > It just looks like a two.
    >


    I've a very strong sense of deja vu, having been round the same loop
    what, two hours ago? Wondering out loud about the number of times a
    programmer has thought "That's easy, I don't need to test it". How are
    the mighty fallen.

    --
    "Steve is going for the pink ball - and for those of you who are
    watching in black and white, the pink is next to the green." Snooker
    commentator 'Whispering' Ted Lowe.

    Mark Lawrence
     
    Mark Lawrence, Jun 15, 2013
    #9
  10. Guest

    On Saturday, June 15, 2013 7:58:44 PM UTC+5:30, Mark Lawrence wrote:
    > On 15/06/2013 14:45, Denis McMahon wrote:
    > > On Sat, 15 Jun 2013 13:41:21 +0000, Denis McMahon wrote:
    > >
    > >> first_and_last = [sentence.split() for i in (0, -1)] middle =
    > >> sentence.split()[1:-2]
    > >
    > > Bugger! That last is actually:
    > >
    > > sentence.split()[1:-1]
    > >
    > > It just looks like a two.
    >
    > I've a very strong sense of deja vu having round the same loop what, two
    > hours ago? Wondering out aloud the number of times a programmer has
    > thought "That's easy, I don't need to test it". How are the mighty fallen.
    >
    > --
    > "Steve is going for the pink ball - and for those of you who are
    > watching in black and white, the pink is next to the green." Snooker
    > commentator 'Whispering' Ted Lowe.
    >
    > Mark Lawrence


    Dear Group,

    I know this solution, but I want to have a regular expression option. Just learning.

    Regards,
    Subhabrata.
     
    , Jun 15, 2013
    #10
  11. Andreas Perstinger, Jun 15, 2013
    #11
  12. On 15/06/2013 15:31, wrote:
    >
    > Dear Group,
    >
    > I know this solution but I want to have Regular Expression option. Just learning.
    >
    > Regards,
    > Subhabrata.
    >


    Start here http://docs.python.org/2/library/re.html

    Would you also please read and action this,
    http://wiki.python.org/moin/GoogleGroupsPython , thanks.

    --
    "Steve is going for the pink ball - and for those of you who are
    watching in black and white, the pink is next to the green." Snooker
    commentator 'Whispering' Ted Lowe.

    Mark Lawrence
     
    Mark Lawrence, Jun 15, 2013
    #12
  13. Guest

    On Saturday, June 15, 2013 8:34:59 PM UTC+5:30, Mark Lawrence wrote:
    > On 15/06/2013 15:31, wrote:
    > >
    > > Dear Group,
    > >
    > > I know this solution but I want to have Regular Expression option. Just learning.
    > >
    > > Regards,
    > > Subhabrata.
    >
    > Start here http://docs.python.org/2/library/re.html
    >
    > Would you also please read and action this,
    > http://wiki.python.org/moin/GoogleGroupsPython , thanks.
    >
    > --
    > "Steve is going for the pink ball - and for those of you who are
    > watching in black and white, the pink is next to the green." Snooker
    > commentator 'Whispering' Ted Lowe.
    >
    > Mark Lawrence


    Dear Group,

    Suppose I want a regular expression that matches both of these:
    "Sent from my iPod"
    "Sent from my iPhone"

    That can be written as,
    re.compile("Sent from my (iPhone|iPod)")

    Now suppose I want to extend it slightly so that it also matches,

    "Taken from my iPod"
    "Taken from my iPhone"

    How can I use "or" at the beginning of the pattern?

    And as a third phase, what if the intermediate phrase,

    "from my" also differs or changes?

    In a nutshell I want to extract a particular group of phrases,
    where, the beginning and end pattern may alter like,

    (i) either from beginning Pattern B1 to end Pattern E1,
    (ii) or from beginning Pattern B1 to end Pattern E2,
    (iii) or from beginning Pattern B2 to end Pattern E2,
    ......

    Regards,
    Subhabrata.
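
One way the alternation asked about here could be written is sketched below. The phrases are the ones from the post; the names `pattern`, `flexible` and `parts` are invented for the example, and the second pattern is just one possible reading of "the intermediate phrase also changes".

```python
import re

# Alternation ("or") works at the start of a pattern just as it does at the end.
pattern = re.compile(r"(Sent|Taken) from my (iPhone|iPod)")

# Assumed generalisation: any text in the middle, captured non-greedily,
# between one of the begin patterns and one of the end patterns.
flexible = re.compile(r"(Sent|Taken)\s+(.*?)\s+(iPhone|iPod)")


def parts(text):
    """Return (begin, middle, end) for a matching text, or None."""
    m = flexible.search(text)
    return m.groups() if m else None


print(pattern.search("Taken from my iPod").groups())
print(parts("Sent from my iPhone"))
```

The parentheses both limit the scope of `|` and create capturing groups; `(?:Sent|Taken)` would group without capturing if the matched word is not needed.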
     
    , Jun 15, 2013
    #13
  14. On 15/06/2013 17:28, wrote:

    You've been pointed at several links, so what have you tried, and what,
    if anything, went wrong? Or do you simply not understand, in which case
    please say so and we'll help. I'm not trying to be awkward, it's simply
    known that you learn more if you try something yourself, rather than be
    spoon fed it.

    --
    "Steve is going for the pink ball - and for those of you who are
    watching in black and white, the pink is next to the green." Snooker
    commentator 'Whispering' Ted Lowe.

    Mark Lawrence
     
    Mark Lawrence, Jun 15, 2013
    #14
  15. Guest

    On 06/15/2013 03:42 AM, wrote:
    > Dear Group,
    >
    > I am trying to search the following pattern in Python.
    >
    > I have following strings:
    >
    > (i)"In the ocean"
    > (ii)"On the ocean"
    > (iii) "By the ocean"
    > (iv) "In this group"
    > (v) "In this group"
    > (vi) "By the new group"
    > .....
    >
    > I want to extract from the first word to the last word,
    > where first word and last word are varying.
    >
    > I am looking to extract out:
    > (i) the
    > (ii) the
    > (iii) the
    > (iv) this
    > (v) this
    > (vi) the new
    > .....
    >
    > The problem may be handled by converting the string to list and then
    > index of list.
    >
    > But I am thinking if I can use regular expression in Python.


    Since nobody here seems to want to answer your question
    (or seems even able to read it), I'll try. Is something
    like this what you want?

    import re

    texts = [
        '(i)"In the ocean"',
        '(ii)"On the ocean"',
        '(iii) "By the ocean"',
        '(iv) "In this group"',
        '(v) "In this group"',
        '(vi) "By the new group"']

    pattern = re.compile (r'^\((.*)\)\s*"\S+\s*(.*)\s\S+"$')
    for txt in texts:
        matchobj = re.search (pattern, txt)
        number, midtext = matchobj.group (1, 2)
        print ("(%s) %s" % (number, midtext))
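
The pattern above is dense, so here is the same expression spelled out with `re.VERBOSE` and named groups. It is an annotated sketch of the posted regex, not a different technique; the group names `number` and `middle` and the helper `parse` are made up for readability.

```python
import re

PATTERN = re.compile(r'''
    ^\((?P<number>.*)\)   # label in parentheses, e.g. (iv)
    \s*"                  # optional whitespace, then the opening quote
    \S+\s*                # first word and the whitespace after it (discarded)
    (?P<middle>.*)        # everything between the first and last word
    \s\S+                 # one whitespace character, then the last word (discarded)
    "$                    # closing quote at the end of the string
''', re.VERBOSE)


def parse(txt):
    """Return (number, middle) for a line like '(vi) "By the new group"', or None."""
    m = PATTERN.search(txt)
    return (m.group('number'), m.group('middle')) if m else None


print(parse('(vi) "By the new group"'))
```

Returning `None` for a non-matching line avoids the `AttributeError` that calling `.group()` on a failed match would raise.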
     
    , Jun 15, 2013
    #15
  16. Guest

    On Saturday, June 15, 2013 3:12:55 PM UTC+5:30, wrote:
    > Dear Group,
    >
    > I am trying to search the following pattern in Python.
    >
    > I have following strings:
    >
    > (i)"In the ocean"
    > (ii)"On the ocean"
    > (iii) "By the ocean"
    > (iv) "In this group"
    > (v) "In this group"
    > (vi) "By the new group"
    > .....
    >
    > I want to extract from the first word to the last word,
    > where first word and last word are varying.
    >
    > I am looking to extract out:
    > (i) the
    > (ii) the
    > (iii) the
    > (iv) this
    > (v) this
    > (vi) the new
    > .....
    >
    > The problem may be handled by converting the string to list and then
    > index of list.
    >
    > But I am thinking if I can use regular expression in Python.
    >
    > If any one of the esteemed members can help.
    >
    > Thanking you in Advance,
    >
    > Regards,
    > Subhabrata


    Dear Group,

    Thank you for the answer. But I want to learn a bit about interesting regular expression forms; where may I do that? No, Mark, thank you for your links, but they were not sufficient. I am looking for more intriguing exercises, especially the use of "or" in the pattern search.

    Regards,
    Subhabrata.
     
    , Jun 15, 2013
    #16
  17. Guest

    On Saturday, June 15, 2013 11:54:28 AM UTC-6, wrote:

    > Thank you for the answer. But I want to learn bit of interesting
    > regular expression forms where may I?
    > No Mark, thank you for your links but they were not sufficient.


    Links to the Python reference documentation are useful for people
    just beginning with some aspect of Python; they are for people who
    already know Python and want to look up details. So it's no
    surprise that you did not find them useful.

    > I am looking for more intriguing exercises, esp use of or in
    > the pattern search.


    Have you tried searching on Google for "regular expression tutorial"?
    It gives a lot of results. I've never tried any of them so I can't
    recommend any one specifically but maybe you can find something
    useful there?

    There is also a Python Howto on regular expressions at
    http://docs.python.org/3/howto/regex.html

    Also, maybe the book "Regular Expressions Cookbook" would
    be useful? It seems to have a lot of specific expressions
    for accomplishing various tasks and seems to be online for
    free at
    http://it-ebooks.info/read/920/
     
    , Jun 15, 2013
    #17
  18. Guest

    On Sunday, June 16, 2013 12:17:18 AM UTC+5:30, wrote:
    > On Saturday, June 15, 2013 11:54:28 AM UTC-6, wrote:
    >
    > > Thank you for the answer. But I want to learn bit of interesting
    > > regular expression forms where may I?
    > > No Mark, thank you for your links but they were not sufficient.
    >
    > Links to the Python reference documentation are useful for people
    > just beginning with some aspect of Python; they are for people who
    > already know Python and want to look up details. So it's no
    > surprise that you did not find them useful.
    >
    > > I am looking for more intriguing exercises, esp use of or in
    > > the pattern search.
    >
    > Have you tried searching on Google for "regular expression tutorial"?
    > It gives a lot of results. I've never tried any of them so I can't
    > recommend any one specifically but maybe you can find something
    > useful there?
    >
    > There is also a Python Howto on regular expressions at
    > http://docs.python.org/3/howto/regex.html
    >
    > Also, maybe the book "Regular Expressions Cookbook" would
    > be useful? It seems to have a lot of specific expressions
    > for accomplishing various tasks and seems to be online for
    > free at
    > http://it-ebooks.info/read/920/


    Dear Group,

    Thank you for the links. Yes, the HOWTO is good, and the cookbook should be good. The Internet changes its contents so fast: a few days back there was a very good regular expression tutorial by Alan Gauld, or there were some mail discussions, and I don't know where they have gone. There is one tutorial of Gauld's, but I think I read something different.

    Regards,
    Subhabrata.
     
    , Jun 15, 2013
    #18
  19. Terry Reedy Guest

    On 6/15/2013 12:28 PM, wrote:

    > Suppose I want a regular expression that matches both "Sent from my iPhone" and "Sent from my iPod". How do I write such an expression--is the problem,
    > "Sent from my iPod"
    > "Sent from my iPhone"
    >
    > which can be written as,
    > re.compile("Sent from my (iPhone|iPod)")
    >
    > now if I want to slightly to extend it as,
    >
    > "Taken from my iPod"
    > "Taken from my iPhone"
    >
    > I am looking how can I use or in the beginning pattern?
    >
    > and the third phase if the intermediate phrase,
    >
    > "from my" if also differs or changes.
    >
    > In a nutshell I want to extract a particular group of phrases,
    > where, the beginning and end pattern may alter like,
    >
    > (i) either from beginning Pattern B1 to end Pattern E1,
    > (ii) or from beginning Pattern B1 to end Pattern E2,
    > (iii) or from beginning Pattern B2 to end Pattern E2,


    The only hints I will add to those given is that you need a) pattern for
    a word, and b) a way to 'anchor' the pattern to the beginning and ending
    of the string so it will only match the first and last words.

    This is a pretty good re practice problem, so go and practice and
    experiment. Expect to fail 20 times and you should beat your
    expectation ;-). The interactive interpreter, or Idle with its F5 Run
    editor window, makes experimenting easy and (for me) fun.
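
For readers who have already had their twenty failures, the two hints above (a pattern for a word, plus anchors) can be combined roughly as follows. This is one possible sketch, not the only answer: it treats a "word" as any run of non-whitespace, and the names `MIDDLE` and `middle_text` are invented here.

```python
import re

# \S+ stands in for "a word"; ^ and $ anchor the pattern to the start and end
# of the string, so the first \S+ can only be the first word and the last \S+
# can only be the last word.
MIDDLE = re.compile(r'^\S+\s+(.*)\s+\S+$')


def middle_text(sentence):
    """Return the text between the first and last word, or None if no match."""
    m = MIDDLE.match(sentence.strip())
    return m.group(1) if m else None


for s in ("In the ocean", "In this group", "By the new group"):
    print(middle_text(s))
```

With this pattern a sentence of only one or two words separated by a single space does not match at all, so the function returns `None` for it; whether that or an empty string is the better result depends on what "extract" is meant to do.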

    --
    Terry Jan Reedy
     
    Terry Reedy, Jun 15, 2013
    #19
  20. Guest

    Oops...

    On Saturday, June 15, 2013 12:47:18 PM UTC-6, wrote:
    > Links to the Python reference documentation are useful for people
    > just beginning with some aspect of Python; they are for people who
    > already know Python and want to look up details.


    That was supposed to be:
    Links to the Python reference documentation are NOT useful for people
    just beginning with some aspect of Python

    and as long as I'm revising, I mean that as a general statement,
    nothing wrong with a reference doc link accompanying a simpler
    explanation or pointer thereto.
     
    , Jun 15, 2013
    #20
