[Newbie Q on String & List Manipulation]

Discussion in 'Python' started by Matthew, Apr 15, 2004.

  1. Matthew

    Matthew Guest

    Hello All,

    today is my first day programming in Python.
    my assignment is to write a silly script that will probably
    run a few times a day to check whether the Gmail service
    is ready or not. ;)

    however, i ran into a problem when playing with
    lists and strings.

    i'm using Python 2.2.2 on Redhat. if i write something like:

    a = "one"
    b = "two"
    a += b
    print a

    i will get:

    onetwo

    ok, that seems fine. however, i'm not sure why it doesn't work in
    my silly Gmail script (please see my script below):

    for item in thecookies:
        mycookies += item

    print mycookies

    i have exactly 4 items in the "thecookies" list. however, when
    printing out "mycookies", it just shows the last item (in fact,
    it seems the 4 items have overwritten each other).

    could somebody please take a look at my silly script and
    give me some advice?

    thanks very much in advance! :)

    ---
    matthew




    import re
    import string
    import sys
    import urllib

    user = ""
    pswd = "dapassword"

    schm = "https://"
    host = "www.google.com"
    path = "/accounts/ServiceLoginBoxAuth"
    qstr = {"service" : "mail", \
            "continue" : "http://gmail.google.com/", \
            "Email" : user, \
            "Passwd" : pswd}

    qstr = urllib.urlencode(qstr)

    url = schm + host + path + "?" + qstr

    conn = urllib.urlopen(url)

    headers = conn.info().headers
    response = conn.read()

    thecookies = []

    #
    # extract all the Set-Cookie from the HTTP response header and put it in thecookies
    #

    for header in headers:
        matches = re.compile("^Set-Cookie: (.*)$").search(header)
        if matches:
            thecookies.append(matches.group(1))

    #
    # make sure we've grep the SID or die
    #

    foundsessionid = 0

    for item in thecookies:
        if re.compile("^SID").search(item):
            foundsessionid = 1
            break

    if not foundsessionid:
        print "> Failed to retrieve the \"SID\" cookie"
        sys.exit()

    #
    # grep the GV cookie from the HTTP response or die
    #

    matches = re.compile("^\s*var cookieVal= \"(.*)\";.*", re.M).search(response)

    if matches:
        thecookies.append("GV=" + matches.group(1))
    else:
        print "> Failed to retrieve the \"GV\" cookie"
        sys.exit()

    print thecookies

    mycookies = ""

    for item in thecookies:
        mycookies += item

    print mycookies

    #
    # still got many things to do right here...
    #

    sys.exit()
    Matthew, Apr 15, 2004
    #1

  2. Matthew wrote:

    >Hello All,
    >
    > today is my first day programming in Python.
    > my assignment is to write a silly script that will probably
    > run a few times a day to check whether the Gmail service
    > is ready or not. ;)
    >
    > however, i ran into a problem when playing with
    > lists and strings.
    >
    >[...]
    >
    > for item in thecookies:
    >     mycookies += item
    >[...]
    >
    >

    Try:

    mycookies = string.join(thecookies, "")
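
A note for readers on current Python versions: string.join(thecookies, "") is the old function form, which no longer exists in the string module in Python 3; the string method "".join(thecookies) does the same job. A minimal sketch follows. The cookie values are made up, and the "; " separator is only an assumption about how the pieces might later be combined into a single Cookie header.

```python
# Hypothetical cookie fragments standing in for the real "thecookies" list.
thecookies = ["Session=zh_HK", "SID=abc123", "GV=xyz"]

# str.join concatenates every item, placing the separator between them.
joined_plain = "".join(thecookies)     # no separator at all
joined_header = "; ".join(thecookies)  # assumed separator, for illustration

print(joined_plain)
print(joined_header)
```

Either call builds the result in one step instead of growing the string inside a loop.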
    Gabriel Cooper, Apr 15, 2004
    #2

  3. Matthew

    Larry Bates Guest

    If you want to join all the items in a list together
    try:

    mycookies="".join(thecookies)

    Secondly, your append to the cookies list is not
    inside your loop. In your code it will only
    get executed a single time (after exiting the
    loop) which is most likely why you only see the
    LAST item. Remember that indentation in Python
    has meaning!

    Something more like this:

    for item in thecookies:
        if re.compile("^SID").search(item):
            foundsessionid = 1
            break

    if not foundsessionid:
        print '> Failed to retrieve the "SID" cookie'
        sys.exit()

    #
    # grep the GV cookie from the HTTP response or die
    #
    matches = re.compile('^\s*var cookieVal= "(.*)";.*',
                         re.M).search(response)
    if matches:
        thecookies.append("GV=" + matches.group(1))
    else:
        print '> Failed to retrieve the "GV" cookie'
        sys.exit()

    print thecookies

    mycookies = "".join(thecookies)
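
To make the general point about indentation concrete, here is a small sketch contrasting a statement inside a loop body with one that only runs after the loop has finished. The list contents and the variable names "inside" and "outside" are made up for illustration.

```python
items = ["a", "b", "c", "d"]  # hypothetical stand-in for thecookies

# Indented under the for: the statement runs once per item.
inside = ""
for item in items:
    inside += item

# Not indented: the statement runs once, after the loop has ended,
# so only the last value of "item" is appended.
outside = ""
for item in items:
    pass
outside += item

print(inside)   # abcd
print(outside)  # d
```

This only illustrates the rule itself. The follow-ups in this thread show that Matthew's loop was indented correctly and the real cause lay elsewhere.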

    Regards,
    Larry Bates
    Syscon, Inc.
    Larry Bates, Apr 15, 2004
    #3
  4. Matthew

    wes weston Guest

    Matthew wrote:
    > for item in thecookies:
    >     mycookies += item


    >>> sum=""
    >>> list=["a","b","c","d"]
    >>> for x in list:
    ...     sum+=x
    ...
    >>> sum
    'abcd'
    wes weston, Apr 15, 2004
    #4
  5. Matthew

    Matthew Guest

    hello all,

    thanks very much for your replies, guys. :)

    i believe the code is already properly indented,
    and i also tried the string.join() method.

    i changed the script a bit to test both a+=b
    and string.join().

    please take a look at the output and the script
    below.

    thanks in advance.

    ---
    matthew


    ##########
    # OUTPUT #
    ##########

    === the content of the "thecookies" list ===
    ['=zh_HK; Expires=Sat, 16-Apr-05 04:57:58 GMT; Path=/\r',
    'Session=zh_HK\r', 'SID=AejhWhifAlWLXGi3lnBd3PiLeNUkoasZRP9kKXc0Es_o;Domain=.google.com;Path=/\r',
    'GV=fbf1ad9eb8-4bbb676189c513f10bfa42556f57c6ac']

    === the content of the string: "mycookies" ===
    GV=fbf1ad9eb8-4bbb676189c513f10bfa42556f57c6ac_o;Domain=.google.com;Path=/

    === use the string.join(): "".join(thecookies) ===
    GV=fbf1ad9eb8-4bbb676189c513f10bfa42556f57c6ac_o;Domain=.google.com;Path=/

    ############
    # Gmail.py #
    ############

    #
    # A Python script that logs in to the Gmail service and checks
    # whether the message is still "Sorry, Gmail is in limited test mode..."
    #
    # Matthew Wong <> 2004-04-15
    #

    import re
    import string
    import sys
    import urllib

    user = ""
    pswd = "maddog4096"

    schm = "https://"
    host = "www.google.com"
    path = "/accounts/ServiceLoginBoxAuth"
    qstr = {"service" : "mail", \
            "continue" : "http://gmail.google.com/", \
            "Email" : user, \
            "Passwd" : pswd}

    qstr = urllib.urlencode(qstr)

    url = schm + host + path + "?" + qstr

    conn = urllib.urlopen(url)

    headers = conn.info().headers
    response = conn.read()

    thecookies = []

    #
    # extract all the Set-Cookie from the HTTP response header and put it
    # in thecookies
    #

    for header in headers:
        matches = re.compile("^Set-Cookie: (.*)$").search(header)
        if matches:
            thecookies.append(matches.group(1))

    #
    # make sure we've grep the SID or die
    #

    foundsessionid = 0

    for item in thecookies:
        if re.compile("^SID").search(item):
            foundsessionid = 1
            break

    if not foundsessionid:
        print "> Failed to retrieve the \"SID\" cookie"
        sys.exit()

    #
    # grep the GV cookie from the HTTP response or die
    #

    matches = re.compile("^\s*var cookieVal= \"(.*)\";.*",
                         re.M).search(response)

    if matches:
        thecookies.append("GV=" + matches.group(1))
    else:
        print "> Failed to retrieve the \"GV\" cookie"
        sys.exit()

    #
    # dump the content of the list: thecookies
    #

    print "=== the content of the \"thecookies\" list ==="
    print thecookies
    print "\n"

    #
    # join the items in the "thecookies" list to
    # the "mycookies" string by using the a += b
    #

    mycookies = ""

    for item in thecookies:
        mycookies += item

    print "=== the content of the string: \"mycookies\" ==="
    print mycookies
    print "\n"

    #
    # join the items in the "thecookies" list to
    # the "mycookies" string by using the string.join()
    #

    print "=== use the string.join(): \"\".join(thecookies) ==="
    print "".join(thecookies)
    print "\n"

    #
    # still got many things to do right here...
    #

    sys.exit()
    Matthew, Apr 16, 2004
    #5
  6. Matthew

    Matthew Guest

    Hello all,

    finally, i found a way to make a+=b work,
    but i don't understand why it works... ;(

    i changed the script from:

    for item in thecookies:
        mycookies += item

    to:

    for item in thecookies:
        mycookies += repr(item)

    and things work fine.

    i've also checked the "type" of both "item" and
    "mycookies", and they are both "str".

    i don't understand why i need to use repr to
    make it work...

    sigh...

    ---
    matthew
    Matthew, Apr 16, 2004
    #6
  7. In article <>,
    (Matthew) wrote:

    > finally, i found a way to make a+=b work,
    > but i don't understand why it works... ;(


    I was seeing a lot of carriage return (^M) characters at the ends of the
    strings you posted. When you concatenate them together and then output
    them, the ^M may cause the lines to overwrite each other, causing you to
    think you're only seeing the last one. But repr turns each of these
    characters into a sequence of two characters, backslash followed by r.
    Is this your problem? If so, maybe you want to call strip() on your
    strings before concatenating or joining them?
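
A minimal sketch of what is probably happening, assuming each header line ends in "\r\n" (which is consistent with the "\r" visible in the list output posted earlier in the thread). The header string below is built by hand for illustration; the regular expression is the one from the script.

```python
import re

# A header line with the usual HTTP line ending. The cookie value is
# copied from the output posted in the thread.
header = "Set-Cookie: Session=zh_HK\r\n"

# "." matches "\r" but not "\n", and "$" also matches just before a
# trailing "\n", so the captured group keeps the carriage return.
raw = re.search(r"^Set-Cookie: (.*)$", header).group(1)

# repr() shows the control character as backslash followed by r.
print(repr(raw))

# strip() removes surrounding whitespace, including the "\r".
clean = raw.strip()
print(repr(clean))
```

Calling strip() on each captured value before appending it to the list would keep the carriage returns out of the joined string.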

    --
    David Eppstein http://www.ics.uci.edu/~eppstein/
    Univ. of California, Irvine, School of Information & Computer Science
    David Eppstein, Apr 16, 2004
    #7
  8. Matthew

    Joe Mason Guest

    In article <>, Matthew wrote:
    > ok, that seems fine. however, i'm not sure why it doesn't work in
    > my silly Gmail script (please see my script below):
    >
    > for item in thecookies:
    >     mycookies += item
    >
    > print mycookies
    >
    > i have exactly 4 items in the "thecookies" list. however, when
    > printing out "mycookies", it just shows the last item (in fact,
    > it seems the 4 items have overwritten each other).


    I had to comment out the SID and GV tests, because they kept failing,
    but then I got:

    ['=en_CA; Expires=Sat, 16-Apr-05 06:35:14 GMT; Path=/\r',
    'Session=en_CA\r']
    Session=en_CAes=Sat, 16-Apr-05 06:35:14 GMT; Path=/

    Note the "\r" at the end of each cookie. That's a carriage return, which
    moves the cursor back to the beginning of the line. mycookies actually
    contains all the data you want, but when you print it the terminal
    interprets the control character, so the cookies overwrite each other.
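
The overwriting can be reproduced without a terminal. The sketch below uses a hypothetical helper, render_with_cr (not part of any library), that mimics how a terminal lays out a single line containing carriage returns. The input string is the one from this post.

```python
def render_with_cr(text):
    """Return what a terminal would leave on screen for one line of text."""
    screen = []  # characters currently visible on the line
    col = 0      # cursor column
    for ch in text:
        if ch == "\r":
            col = 0  # carriage return: back to the start of the line
        else:
            if col < len(screen):
                screen[col] = ch   # overwrite what is already there
            else:
                screen.append(ch)  # extend the line
            col += 1
    return "".join(screen)


mycookies = "=en_CA; Expires=Sat, 16-Apr-05 06:35:14 GMT; Path=/\rSession=en_CA\r"
shown = render_with_cr(mycookies)
print(shown)  # Session=en_CAes=Sat, 16-Apr-05 06:35:14 GMT; Path=/
```

The second cookie is written over the first 13 characters of the first one, which gives exactly the mixed-up line shown above.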

    Tip for future debugging: you don't need to do "sys.exit" at the end.
    It will automatically exit for you. And if you don't do that, you can
    run your script with "python -i" and get an interactive prompt when the
    script is finished, so you can examine the variables directly instead of
    going through print:

    >>> mycookies
    '=en_CA; Expires=Sat, 16-Apr-05 06:44:23 GMT; Path=/\rSession=en_CA\r'

    Joe
    Joe Mason, Apr 16, 2004
    #8
  9. Matthew

    Matthew Guest

    Hello David & Joe,

    thanks very much for David's information about
    the carriage returns, and thanks very much for Joe's tips
    on sys.exit() and the "-i" parameter for debugging.

    =)

    ---
    matthew
    Matthew, Apr 24, 2004
    #9