Re: Curious to see alternate approach on a search/replace via regex

Discussion in 'Python' started by rh, Feb 7, 2013.

  1. rh

    rh Guest

    On Thu, 07 Feb 2013 10:49:06 +0100
    Peter Otten wrote:

    > rh wrote:
    >
    > > I am curious to know if others would have done this differently.
    > > And if so how so?
    > >
    > > This converts a url to a more easily managed filename, stripping the
    > > http protocol off.
    > >
    > > This:
    > >
    > > http://alongnameofasite1234567.com/q?sports=run&a=1&b=1
    > >
    > > becomes this:
    > >
    > > alongnameofasite1234567_com_q_sports_run_a_1_b_1
    > >
    > >
    > > import re
    > >
    > > def u2f(u):
    > >     nx = re.compile(r'https?://(.+)$')
    > >     u = nx.search(u).group(1)
    > >     ux = re.compile(r'([-:./?&=]+)')
    > >     return ux.sub('_', u)
    > >
    > > One alternative is to skip the compile step. There must also be a
    > > way to do it all at once, i.e. remove the protocol and replace the
    > > chars in a single pass.

    >
    > Completely without regular expressions:
    >
    > import string
    >
    > ILLEGAL = "-:./?&="
    > try:
    >     TRANS = string.maketrans(ILLEGAL, "_" * len(ILLEGAL))
    > except AttributeError:
    >     # python 3
    >     TRANS = dict.fromkeys(map(ord, ILLEGAL), "_")
    >
    > PROTOCOLS = {"http", "https"}
    >
    > def url_to_file(url):
    >     protocol, sep, rest = url.partition("://")
    >     if protocol not in PROTOCOLS:
    >         raise ValueError
    >     return rest.translate(TRANS)
    >
    > if __name__ == "__main__":
    >     url = "http://alongnameofasite1234567.com/q?sports=run&a=1&b=1"
    >     print(url)
    >     print(url_to_file(url))
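
On the "do it all at once" question from the original post, here is one possible sketch using a single pattern and a replacement function. The names `URL_PARTS` and `u2f_single_pass` are made up for illustration and appear nowhere in the thread. Note one behavioural difference: a string without an http/https prefix is passed through (with its characters replaced) instead of raising an error as `u2f` does.

```python
import re

# One alternation: either the leading protocol (captured in group 1)
# or a run of the characters that should become "_".
URL_PARTS = re.compile(r'^(https?://)|[-:./?&=]+')

def u2f_single_pass(u):
    # Group 1 is set only when the protocol branch matched; drop that
    # match and turn every other match into a single underscore.
    return URL_PARTS.sub(lambda m: '' if m.group(1) else '_', u)

if __name__ == "__main__":
    print(u2f_single_pass("http://alongnameofasite1234567.com/q?sports=run&a=1&b=1"))
```

Whether this is any clearer than two separate steps is a matter of taste; calling a Python function for every match also has a cost, so it is not necessarily faster.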


    In my test, Python 2.7.3 is 85% faster than 3.3.0
    (my test does no printing)
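
For anyone who wants to repeat a comparison like this, a minimal timing sketch with `timeit` follows. It is not the test that produced the figure above: the helper `best_time`, the loop counts, and the Python 3 only `str.maketrans` call are assumptions chosen for illustration, and the two functions are restated with precompiled patterns so that no printing and no repeated setup is timed.

```python
import re
import timeit

URL = "http://alongnameofasite1234567.com/q?sports=run&a=1&b=1"

# Regex version, with both patterns compiled once at import time.
NX = re.compile(r'https?://(.+)$')
UX = re.compile(r'[-:./?&=]+')

def u2f(u):
    return UX.sub('_', NX.search(u).group(1))

# Translate version (Python 3 only: str.maketrans does not exist on Python 2).
ILLEGAL = "-:./?&="
TRANS = str.maketrans(ILLEGAL, "_" * len(ILLEGAL))

def url_to_file(url):
    protocol, sep, rest = url.partition("://")
    if protocol not in {"http", "https"}:
        raise ValueError(url)
    return rest.translate(TRANS)

def best_time(func, number=20000, repeat=3):
    # Hypothetical helper: best total time in seconds over `repeat` runs
    # of `number` calls each; the minimum is the least noisy figure.
    return min(timeit.repeat(lambda: func(URL), number=number, repeat=repeat))

if __name__ == "__main__":
    for func in (u2f, url_to_file):
        print(func.__name__, best_time(func))
```

Running the same script under each interpreter of interest gives numbers that can be compared directly; results will vary with machine and build.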
     
    rh, Feb 7, 2013