String concatenation vs. string formatting

Discussion in 'Python' started by Andrew Berg, Jul 8, 2011.

  1. Andrew Berg

    Andrew Berg Guest

    Is it bad practice to use this
    > logger.error(self.preset_file + ' could not be stored - ' +
    > sys.exc_info()[1])

    Instead of this?
    > logger.error('{file} could not be stored - {error}'.format(
    >     file=self.preset_file, error=sys.exc_info()[1]))



    Other than the case where a variable isn't a string (format() converts
    variables to strings, automatically, right?) and when a variable is used
    a bunch of times, concatenation is fine, but somehow, it seems wrong.
    Sorry if this seems a bit silly, but I'm a novice when it comes to
    design. Plus, there's not really supposed to be "more than one way to do
    it" in Python.
    Andrew Berg, Jul 8, 2011
    #1

  2. John Gordon

    John Gordon Guest

    Andrew Berg writes:

    > Is it bad practice to use this
    > > logger.error(self.preset_file + ' could not be stored - ' +
    > > sys.exc_info()[1])

    > Instead of this?
    > > logger.error('{file} could not be stored - {error}'.format(
    > >     file=self.preset_file, error=sys.exc_info()[1]))


    > Other than the case where a variable isn't a string (format() converts
    > variables to strings, automatically, right?) and when a variable is used
    > a bunch of times, concatenation is fine, but somehow, it seems wrong.
    > Sorry if this seems a bit silly, but I'm a novice when it comes to
    > design. Plus, there's not really supposed to be "more than one way to do
    > it" in Python.


    Concatenation feels ugly/clunky to me.

    I prefer this usage:

    logger.error('%s could not be stored - %s' % \
    (self.preset_file, sys.exc_info()[1]))

    --
    John Gordon A is for Amy, who fell down the stairs
    B is for Basil, assaulted by bears
    -- Edward Gorey, "The Gashlycrumb Tinies"
    John Gordon, Jul 8, 2011
    #2

  3. Billy Mays

    Billy Mays Guest

    On 07/08/2011 04:18 PM, Andrew Berg wrote:
    > Is it bad practice to use this
    >> logger.error(self.preset_file + ' could not be stored - ' +
    >> sys.exc_info()[1])

    > Instead of this?
    >> logger.error('{file} could not be stored - {error}'.format(
    >>     file=self.preset_file, error=sys.exc_info()[1]))

    >
    >
    > Other than the case where a variable isn't a string (format() converts
    > variables to strings, automatically, right?) and when a variable is used
    > a bunch of times, concatenation is fine, but somehow, it seems wrong.
    > Sorry if this seems a bit silly, but I'm a novice when it comes to
    > design. Plus, there's not really supposed to be "more than one way to do
    > it" in Python.

    If it means anything, I think concatenation is faster.

    __TIMES__
    a() - 0.09s
    b() - 0.09s
    c() - 54.80s
    d() - 5.50s

    Code is below:

    # Python 2 code. With a 10-character v, every function builds the
    # same 10,000,000-character string.
    def a(v):
        out = ""
        for i in xrange(1000000):
            out += v
        return len(out)

    def b(v):
        out = ""
        for i in xrange(100000):
            out += v+v+v+v+v+v+v+v+v+v
        return len(out)

    def c(v):
        out = ""
        for i in xrange(1000000):
            out = "%s%s" % (out, v)
        return len(out)

    def d(v):
        out = ""
        for i in xrange(100000):
            out = "%s%s%s%s%s%s%s%s%s%s%s" % (out,v,v,v,v,v,v,v,v,v,v)
        return len(out)

    print "a", a('xxxxxxxxxx')
    print "b", b('xxxxxxxxxx')
    print "c", c('xxxxxxxxxx')
    print "d", d('xxxxxxxxxx')

    import profile

    profile.run("a('xxxxxxxxxx')")
    profile.run("b('xxxxxxxxxx')")
    profile.run("c('xxxxxxxxxx')")
    profile.run("d('xxxxxxxxxx')")
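
For anyone repeating this comparison on Python 3, here is a minimal sketch using timeit instead of profile. The function names and the small iteration count are assumptions chosen for illustration. The numbers it prints depend on the interpreter and machine, so none are quoted here.

```python
import timeit

def concat(v, n):
    """Build the result with repeated += (fast only where the interpreter optimizes it)."""
    out = ""
    for _ in range(n):
        out += v
    return out

def interpolate(v, n):
    """Re-format the whole accumulated string on every pass."""
    out = ""
    for _ in range(n):
        out = "%s%s" % (out, v)
    return out

def join(v, n):
    """Collect the pieces and join them once at the end."""
    return "".join(v for _ in range(n))

if __name__ == "__main__":
    piece, count = "xxxxxxxxxx", 2000  # kept small so the sketch finishes quickly
    for func in (concat, interpolate, join):
        seconds = timeit.timeit(lambda: func(piece, count), number=5)
        print("%-12s %.4fs" % (func.__name__, seconds))
```

All three functions return the same string, so only the time differs.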
    Billy Mays, Jul 8, 2011
    #3
  4. Andrew Berg

    Andrew Berg Guest

    On 2011.07.08 05:59 PM, Ben Finney wrote:
    > With the caveat that the formatting of that line should be using PEP 8
    > indentation for clarity:

    PEP 8 isn't bad, but I don't agree with everything in it. Certain lines
    look good in chunks, some don't, at least to me. It's quite likely I'm
    going to be writing 98%, if not more, of this project's code, so what
    looks good to me matters more than a standard (as long as the code
    works). Obviously, if I need to work in a team, then things change.

    > > and when a variable is used a bunch of times, concatenation is fine,

    I prefaced that sentence with "Other than the case", as in "except for
    the following case(s)".

    > There is often more than one way to do it. The Zen of Python is explicit
    > that there should be one obvious way to do it (and preferably only one).

    I meant in contrast to the idea of intentionally having multiple ways to
    do something, all with roughly equal merit.



    On 2011.07.08 04:38 PM, Ian Kelly wrote:
    > Also, string formatting (especially using the new syntax like you are)
    > is much clearer because there's less noise (the quotes all over the
    > place and the plusses)

    I don't find it that much clearer unless there are a lot of chunks.

    > and it's better for dealing with internationalization if you need to
    > do that.

    I hadn't thought of that. That's probably the best reason to use string
    formatting.
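
A small sketch of why formatting helps with internationalization: named placeholders let a translation reorder the values without touching the calling code. The MESSAGES dictionary, the store_error helper, and the German wording are assumptions made up for illustration. A real project would more likely go through gettext.

```python
# Hypothetical message catalog keyed by language code.
MESSAGES = {
    "en": "{file} could not be stored - {error}",
    # Illustrative translation that puts the error first.
    "de": "Fehler {error}: {file} konnte nicht gespeichert werden",
}

def store_error(lang, file, error):
    """Fill the template for the given language; argument order never changes."""
    return MESSAGES[lang].format(file=file, error=error)

print(store_error("en", "preset.ini", "disk full"))
print(store_error("de", "preset.ini", "disk full"))
```

With concatenation, the word order is baked into the code, so each language would need its own expression.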


    Thanks, everyone.
    Andrew Berg, Jul 9, 2011
    #4
  5. * John Gordon (Fri, 8 Jul 2011 20:23:52 +0000 (UTC))
    > I prefer this usage:
    >
    > logger.error('%s could not be stored - %s' % \
    > (self.preset_file, sys.exc_info()[1]))


    The syntax for formatting logging messages according to the
    documentation is:

    Logger.error(msg, *args)

    NOT

    Logger.error(msg % (*args))
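
A minimal Python 3 sketch of the documented calling style, where the arguments are passed separately and the logging module does the % interpolation itself. The logger name, the in-memory stream, and the preset_file value are placeholders for illustration.

```python
import io
import logging

stream = io.StringIO()                      # capture output for the demo
logger = logging.getLogger("preset_demo")   # hypothetical logger name
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.ERROR)
logger.propagate = False

preset_file = "preset.ini"  # placeholder value

try:
    raise IOError("disk full")
except IOError as exc:
    # msg and args are passed separately; logging merges them when it emits.
    logger.error("%s could not be stored - %s", preset_file, exc)

# Below the ERROR threshold: the record is dropped and never formatted.
logger.debug("%s loaded", preset_file)

print(stream.getvalue(), end="")
```

Because the interpolation is deferred, a message that is filtered out costs no formatting work.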

    Thorsten
    Thorsten Kampe, Jul 9, 2011
    #5
  6. Andrew Berg wrote:

    > Is it bad practice to use this
    >> logger.error(self.preset_file + ' could not be stored - ' +
    >> sys.exc_info()[1])

    > Instead of this?
    >> logger.error('{file} could not be stored - {error}'.format(
    >>     file=self.preset_file, error=sys.exc_info()[1]))

    >
    >
    > Other than the case where a variable isn't a string (format() converts
    > variables to strings, automatically, right?)


    Not exactly, but more or less. format() has type codes, just like % string
    interpolation:


    >>> '{0:d}'.format(1)
    '1'
    >>> '{0:d}'.format(None)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: Unknown format code 'd' for object of type 'str'

    >>> '%d' % 1
    '1'
    >>> '%d' % None
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: %d format: a number is required, not NoneType

    If you don't give a type code, format converts any object to string (if
    possible).
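
To illustrate the no-type-code case, a short Python 3 sketch: format() turns arbitrary objects into text, while + refuses to mix a str with anything else. The Preset class is a made-up stand-in for any non-string object.

```python
class Preset(object):
    """Hypothetical non-string object that knows how to describe itself."""
    def __str__(self):
        return "preset.ini"

error = ValueError("bad value")

# No type code in the placeholders, so each object is converted with str().
message = "{0} could not be stored - {1}".format(Preset(), error)
print(message)

# Concatenation does no conversion: str + exception raises TypeError.
concat_failed = False
try:
    "could not be stored - " + error
except TypeError:
    concat_failed = True
print("concatenation failed:", concat_failed)
```

This is also why the concatenation version in the original question would fail as written: sys.exc_info()[1] is an exception object, not a string.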


    > and when a variable is used
    > a bunch of times, concatenation is fine, but somehow, it seems wrong.


    I don't like long chains of string concatenation, but short chains seem okay
    to me. One or two plus signs seem fine to my eyes, three at the most. Any
    more than that, I'd look at replacing it with % interpolation, the
    str.join() idiom, the string.Template class, or str.format.

    That's five ways of building strings.
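
For reference, a sketch of the five approaches side by side, all producing the same text. The variable names and sample values are arbitrary.

```python
from string import Template

name, count = "spam", 3

by_concat = name + " x " + str(count)                 # + needs explicit str()
by_percent = "%s x %d" % (name, count)                # % interpolation
by_join = " ".join([name, "x", str(count)])           # str.join() idiom
by_template = Template("$name x $count").substitute(name=name, count=count)
by_format = "{0} x {1}".format(name, count)           # str.format

results = [by_concat, by_percent, by_join, by_template, by_format]
print(results)
```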

    Of course, *repeated* string concatenation risks being slow -- not just a
    little slow, but potentially MASSIVELY slow, hundreds or thousands of times
    slower than the alternatives. Fortunately, recent versions of CPython tend to
    avoid this (which makes it all the more mysterious when the slow-down does
    strike), but other Pythons like Jython and IronPython may not. So it's best
    to limit string concatenation to one or two strings.

    And finally, if you're concatenating string literals, you can use implicit
    concatenation (*six* ways):

    >>> s = ("hello "
    ...      "world"
    ...      "!")
    >>> s
    'hello world!'


    > Sorry if this seems a bit silly, but I'm a novice when it comes to
    > design. Plus, there's not really supposed to be "more than one way to do
    > it" in Python.


    On the contrary -- there are many different examples of "more than one way
    to do it". The claim that Python has "only one way" to do things comes from
    the Perl community, and is wrong.

    It is true that Python doesn't deliberately add multiple ways of doing
    things just for the sake of being different, or because they're cool,
    although of course that's a subjective judgement. (Some people think that
    functional programming idioms such as map and filter fall into that
    category, wrongly in my opinion.) In any case, it's clear that Python
    supports many ways of doing "the same thing", not all of which are exactly
    equivalent:

    # e.g. copy a list
    blist = list(alist)
    blist = alist[:]
    blist[:] = alist # assumes blist already exists
    blist = copy.copy(alist)
    blist = copy.deepcopy(alist)
    blist = []; blist.extend(alist)
    blist = [x for x in alist] # don't do this


    Hardly "only one way" :)
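
A short sketch of where those copies stop being equivalent, assuming a nested list as the sample data: shallow copies share the inner lists, deepcopy does not, and slice assignment changes an existing list in place.

```python
import copy

alist = [[1, 2], [3, 4]]

shallow = list(alist)        # new outer list, same inner lists
deep = copy.deepcopy(alist)  # inner lists are copied too

alist[0].append(99)
print(shallow[0])  # [1, 2, 99]
print(deep[0])     # [1, 2]

# Slice assignment refills the existing object, so other references see it.
blist = [0]
alias = blist
blist[:] = alist
print(alias is blist, alias == alist)  # True True
```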



    --
    Steven
    Steven D'Aprano, Jul 9, 2011
    #6
  7. Billy Mays wrote:

    > If it means anything, I think concatenation is faster.


    You are measuring the speed of an implementation-specific optimization.
    You'll likely get *very* different results with Jython or IronPython, or
    old versions of CPython, or even if you use instance attributes instead of
    local variables.

    It also doesn't generalise: only appends are optimized, not prepends.

    Worse, the optimization can be defeated by accidents of your operating
    system's memory management, so code that runs snappily and fast on one
    machine will run SLLLOOOOOOWWWWWWWWWWWWWWWLY on another.

    This is not a hypothetical risk. People have been burned by this in real
    life:

    http://www.gossamer-threads.com/lists/python/dev/771948

    If you're interested in learning about the optimization:

    http://utcc.utoronto.ca/~cks/space/blog/python/ExaminingStringConcatOpt




    --
    Steven
    Steven D'Aprano, Jul 9, 2011
    #7
  8. On Sat, Jul 9, 2011 at 3:30 PM, Steven D'Aprano wrote:
    > It also doesn't generalise: only appends are optimized, not prepends.
    >
    > If you're interested in learning about the optimization:
    >
    > http://utcc.utoronto.ca/~cks/space/blog/python/ExaminingStringConcatOpt


    From that page:

    "Also, this is only for plain (byte) strings, not for Unicode strings;
    as of Python 2.4.2, Unicode string concatenation remains
    un-optimized."

    Has the same optimization been implemented for Unicode? The page
    doesn't mention Python 3 at all, and I would guess that the realloc
    optimization would work fine for both types of string.

    ChrisA
    Chris Angelico, Jul 9, 2011
    #8
  9. Ian Kelly

    Ian Kelly Guest

    On Sat, Jul 9, 2011 at 12:16 AM, Chris Angelico wrote:
    > Has the same optimization been implemented for Unicode? The page
    > doesn't mention Python 3 at all, and I would guess that the realloc
    > optimization would work fine for both types of string.


    Seems to be implemented for strs in 3.2, but not unicode in 2.7.
    Ian Kelly, Jul 9, 2011
    #9
  10. Andrew Berg

    Andrew Berg Guest


    How should I go about switching from concatenation to string formatting
    for this?

    avs.write(demux_filter + field_filter + fpsin_filter + i2pfilter +
    dn_filter + fpsout_filter + trim_filter + info_filter)

    I can think of a few ways, but none of them are pretty.

    --
    CPython 3.2 | Windows NT 6.1.7601.17592 | Thunderbird 5.0
    Andrew Berg, Jul 10, 2011
    #10
  11. Andrew Berg wrote:

    > How should I go about switching from concatenation to string formatting
    > for this?
    >
    > avs.write(demux_filter + field_filter + fpsin_filter + i2pfilter +
    > dn_filter + fpsout_filter + trim_filter + info_filter)
    >
    > I can think of a few ways, but none of them are pretty.


    fields = (demux_filter, field_filter, fpsin_filter, i2pfilter,
    dn_filter, fpsout_filter, trim_filter, info_filter)
    avs.write("%s"*len(fields) % fields)

    works for me.




    --
    Steven
    Steven D'Aprano, Jul 10, 2011
    #11
  12. Roy Smith

    Roy Smith Guest

    Andrew Berg wrote:

    > How should I go about switching from concatenation to string formatting
    > for this?
    >
    > avs.write(demux_filter + field_filter + fpsin_filter + i2pfilter +
    > dn_filter + fpsout_filter + trim_filter + info_filter)
    >
    > I can think of a few ways, but none of them are pretty.


    The canonical way to do that would be something like

    fields = [demux_filter,
              field_filter,
              fpsin_filter,
              i2pfilter,
              dn_filter,
              fpsout_filter,
              trim_filter,
              info_filter]
    avs.write(''.join(fields))
    Roy Smith, Jul 10, 2011
    #12
  13. Roy Smith wrote:

    > The canonical way to do that would be something like
    >
    > fields = [demux_filter,
    > field_filter,
    > fpsin_filter,
    > i2pfilter,
    > dn_filter,
    > fpsout_filter,
    > trim_filter,
    > info_filter]
    > avs.write(''.join(fields))


    I can't believe I didn't think of that. I must be getting sick. (The sore
    throat, stuffy nose and puffy eyes may also be a sign.)

    Yes, ''.join() is far to be preferred over my solution using "%s".



    --
    Steven
    Steven D'Aprano, Jul 10, 2011
    #13
  14. Andrew Berg

    Andrew Berg Guest


    On 2011.07.10 09:33 AM, Roy Smith wrote:
    > The canonical way to do that would be something like
    >
    > fields = [demux_filter, field_filter, fpsin_filter, i2pfilter,
    > dn_filter, fpsout_filter, trim_filter, info_filter]
    > avs.write(''.join(fields))

    That would look really awful (IMO) if the strings weren't intended to be
    on separate lines (I use embedded newlines instead of joining them with
    newlines in order to prevent blank lines whenever a certain filter isn't
    used). In this particular case, they are, so even though it uses a lot
    of whitespace, it does match the layout of its output.
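
One possible alternative to embedding the newlines, sketched under the assumption that an unused filter is an empty string: drop the empty entries before joining, so no blank lines appear. The filter contents below are invented placeholders, not real script lines.

```python
# Placeholder filter strings; an unused filter is assumed to be "".
demux_filter = "demux()"
field_filter = ""          # this filter is not used
trim_filter = "trim()"

fields = [demux_filter, field_filter, trim_filter]

# Skip empty entries, then join with newlines and end with one newline.
script = "\n".join(f for f in fields if f) + "\n"
print(script, end="")
```

Whether this reads better than embedded newlines is a matter of taste. It does keep the line endings in one place.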

    --
    CPython 3.2 | Windows NT 6.1.7601.17592 | Thunderbird 5.0
    Andrew Berg, Jul 11, 2011
    #14