RE: regex for url parameter

Discussion in 'Python' started by Robert Brewer, Dec 7, 2004.

  1. Andreas Volz wrote:
    > I'm trying to extract an http target from a URL that is given as a parameter.
    > urlparse couldn't really help me. I tried it like this:
    >
    > url="http://www.example.com/example.html?url=http://www.example.org/example.html"
    >
    > p = re.compile( '.*url=')
    > url = p.sub( '', url)
    > print url
    > > http://www.example.org/example.html

    >
    > This works, but if there are more parameters, they are left in the result:
    >
    > url2="http://www.example.com/example.html?url=http://www.example.org/example.html&param=1"
    >
    > p = re.compile( '.*url=')
    > url2 = p.sub( '', url2)
    > print url2
    > > http://www.example.org/example.html&param=1

    >
    > I played with regexes to find one that also matches the second case with
    > multiple parameters. I think it's easy, but I don't know how
    > to do it. Can you help me?


    I'd go back to urlparse if I were you.

    >>> import urlparse
    >>> url = "http://www.example.com/example.html?url=http://www.example.org/example.html"
    >>> urlparse.urlparse(url)
    ('http', 'www.example.com', '/example.html', '', 'url=http://www.example.org/example.html', '')
    >>> query = urlparse.urlparse(url)[4]
    >>> params = [p.split("=", 1) for p in query.split("&")]
    >>> params
    [['url', 'http://www.example.org/example.html']]
    >>> urlparse.urlparse(params[0][1])
    ('http', 'www.example.org', '/example.html', '', '', '')
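
For readers on Python 3, the same steps might be wrapped up as follows. This is only a sketch under assumptions: in Python 3 the urlparse module lives at urllib.parse, and the helper name get_query_param is made up here for illustration (it is not part of any library). It returns the raw value without percent-decoding.

```python
from urllib.parse import urlparse  # Python 3 home of the old urlparse module


def get_query_param(url, name):
    """Return the raw value of query parameter `name`, or None if absent.

    Hypothetical helper for illustration; values are not percent-decoded.
    """
    query = urlparse(url).query  # same string as urlparse(url)[4]
    for pair in query.split("&"):
        # partition() splits at the first "=" only, like split("=", 1)
        key, sep, value = pair.partition("=")
        if sep and key == name:
            return value
    return None


url2 = ("http://www.example.com/example.html"
        "?url=http://www.example.org/example.html&param=1")
target = get_query_param(url2, "url")
print(target)                    # http://www.example.org/example.html
print(urlparse(target).netloc)   # www.example.org
```

Because the query is split on "&" first, the trailing "&param=1" from the second example in the question ends up as its own parameter instead of being glued to the target URL.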


    Robert Brewer
    MIS
    Amor Ministries
    Robert Brewer, Dec 7, 2004
    #1

  2. Paul McGuire wrote:

    In reply to Robert Brewer's message above:

    Robert Brewer's params list comprehension may be a bit much to swallow all
    at once for someone new to Python, but it is a very slick example, and it
    works for multiple parameters:
    [p.split("=", 1) for p in query.split("&")]

    First of all, the variable query is element 4 of the tuple returned by
    urlparse and contains everything in the original url after the '?'. Inside
    the list comprehension, 'query.split("&")' returns a list of strings, one
    for each individual parameter assignment. 'for p in query.split("&")'
    iterates over this list, binding the temporary variable 'p' to each
    parameter in turn. For example, [p for p in query.split("&")] is sort of a
    nonsense list comprehension: it just rebuilds the list returned from
    query.split("&"). But instead, Robert splits each 'p' at its first equals
    sign (the second argument, 1, limits the split to a single cut, so an '='
    inside the value is left alone). For each parameter we therefore get a
    2-element list: the parameter name and its assigned value. The list
    comprehension does all of this iteration and list building in one single,
    compact statement.

    A long, spelled-out version would look like:
    allparams = query.split("&")
    params = []
    for p in allparams:
        params.append( p.split("=",1) )
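
The equivalence of the two forms, and the effect of the second argument to split, can be checked with a short script. The query string here is made up for illustration; its second parameter deliberately contains an extra '=' in the value.

```python
# Made-up query string; the value of "q" contains an "=" on purpose.
query = "url=http://www.example.org/example.html&q=a=b"

# Compact form: one [name, value] pair per parameter.
params = [p.split("=", 1) for p in query.split("&")]

# Spelled-out form: the same list, built step by step.
params_loop = []
for p in query.split("&"):
    params_loop.append(p.split("=", 1))

print(params == params_loop)   # True
print(params[1])               # ['q', 'a=b']
print("q=a=b".split("="))      # ['q', 'a', 'b']  (no limit: value is cut apart)
```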

    Now if we make a slight change to Robert Brewer's "params = [p.split..." line
    and wrap the list in dict():
    params = dict( [p.split("=", 1) for p in query.split("&")] )
    this will create a dictionary for you (the dict() constructor accepts a
    list of pairs and interprets each one as a key-value entry in the dictionary).
    Then you can reference the params by name. Here's the example, with more
    than one param in the url.

    >>> url = "http://www.example.com/example.html?url=http://www.example.org/example.html&url2=http://www.xyzzy.net/zork.html"
    >>> print urlparse.urlparse(url)
    ('http', 'www.example.com', '/example.html', '', 'url=http://www.example.org/example.html&url2=http://www.xyzzy.net/zork.html', '')
    >>> query = urlparse.urlparse(url)[4]
    >>> params = dict([p.split("=", 1) for p in query.split("&")])
    >>> print params
    {'url': 'http://www.example.org/example.html', 'url2': 'http://www.xyzzy.net/zork.html'}
    >>> print params.keys()
    ['url', 'url2']
    >>> print params['url']
    http://www.example.org/example.html
    >>> print params['url2']
    http://www.xyzzy.net/zork.html
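
As a point of comparison, here is a sketch of the dict() approach next to the standard library's own query parser. It assumes Python 3, where both urlparse and parse_qs are found in urllib.parse. Two differences are worth knowing: parse_qs maps each name to a list of values (so repeated parameters are not lost) and percent-decodes them, while the hand-rolled dict keeps only the last value per name and raises ValueError on a parameter that has no '='.

```python
from urllib.parse import urlparse, parse_qs

url = ("http://www.example.com/example.html"
       "?url=http://www.example.org/example.html"
       "&url2=http://www.xyzzy.net/zork.html")

query = urlparse(url).query

# Hand-rolled version: a later duplicate name would overwrite an earlier one.
params = dict(p.split("=", 1) for p in query.split("&"))

# Standard-library version: each name maps to a list of values.
parsed = parse_qs(query)

print(params["url2"])   # http://www.xyzzy.net/zork.html
print(parsed["url"])    # ['http://www.example.org/example.html']
```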


    List comprehensions are another powerful tool to put in your Python toolbox.

    Keep pluggin' away, Andreas!

    -- Paul
    Paul McGuire, Dec 8, 2004
    #2