Opposite of split

Discussion in 'Python' started by Alex van der Spek, Aug 15, 2010.

  1. Looking for a method that does the opposite of 'split', i.e. elements in a
    list are automatically concatenated with a user selectable spacer in between
    e.g. '\t'. This is to prepare lines to be written to a sequential file by
    'write'.

    All hints welcome.

    Regards,
    Alex van der Spek
     
    Alex van der Spek, Aug 15, 2010
    #1

  2. On 15.08.2010 20:24, Alex van der Spek wrote:
    > Looking for a method that does the opposite of 'split', i.e. elements in
    > a list are automatically concatenated with a user selectable spacer in
    > between e.g. '\t'.
    >>> " ".join(["i","am","a","list"])

    'i am a list'

    Wieland
     
    Wieland Hoffmann, Aug 15, 2010
    #2

  3. On 08/15/2010 11:24 AM, Alex van der Spek wrote:
    > Looking for a method that does the opposite of 'split', i.e. elements
    > in a list are automatically concatenated with a user selectable spacer
    > in between e.g. '\t'. This is to prepare lines to be written to a
    > sequential file by 'write'.
    >
    > All hints welcome.
    >
    > Regards,
    > Alex van der Spek


    Strings have a join method for this:
    '\t'.join(someList)
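
A minimal sketch of how join might be used to prepare a line for write. The list name `fields` and its contents are hypothetical, and an in-memory buffer stands in for a real file. One thing to keep in mind: join accepts only strings, so other values are converted with str first.

```python
import io

# Hypothetical record; these values are illustrative, not from the original post.
fields = ["2010-08-15", 3.5, 42]

# str.join requires every element to be a string, so convert each value first.
line = "\t".join(str(value) for value in fields) + "\n"

# io.StringIO stands in for a file object returned by open(path, "w").
out = io.StringIO()
out.write(line)

# split reverses the join on the text without its trailing newline.
parts = line.rstrip("\n").split("\t")
print(parts)
```
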

    Gary Herron
     
    Gary Herron, Aug 15, 2010
    #3
  4. On 08/15/2010 11:35 AM, Gary Herron wrote:
    > On 08/15/2010 11:24 AM, Alex van der Spek wrote:
    >> Looking for a method that does the opposite of 'split', i.e. elements
    >> in a list are automatically concatenated with a user selectable
    >> spacer in between e.g. '\t'. This is to prepare lines to be written
    >> to a sequential file by 'write'.
    >>
    >> All hints welcome.
    >>
    >> Regards,
    >> Alex van der Spek

    >
    > Strings have a join method for this:
    > '\t'.join(someList)
    >
    > Gary Herron

    or maybe:
    -----------------------------------------
    res = ""
    for item in myList:
        res = "%s\t%s" % ( res, item )

    myList = ["abc","def","hjk"]
    res = ""
    for item in myList:
        res = "%s\t%s" % ( res, item )
    res
    '\tabc\tdef\thjk'


    print res
    abc def hjk

    Note the leading tab.
    -----------------------------------------
    So:
    >>> res.strip()

    'abc\tdef\thjk'
    >>> print res.strip()

    abc def hjk

    simple enough. Strange you had to ask.

    sph
     
    Steven Howe, Aug 15, 2010
    #4
  5. On Sun, 15 Aug 2010 12:10:10 -0700, Steven Howe wrote:

    >> Strings have a join method for this:
    >> '\t'.join(someList)
    >>
    >> Gary Herron

    > or maybe:
    > -----------------------------------------
    > res = ""
    > for item in myList:
    > res = "%s\t%s" % ( res, item )



    Under what possible circumstances would you prefer this code to the
    built-in str.join method?

    Particularly since the code isn't even correct, as it adds a spurious tab
    character at the beginning of the result string.

    (By the way, your solution, to call res.strip(), is incorrect, as it
    removes too much.)
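
To make "removes too much" concrete, here is a small sketch. The sample list is made up for illustration: it has an empty first field and a trailing space in the last one, both of which strip() discards along with the spurious tab.

```python
def loop_then_strip(items):
    """The loop from the earlier post, followed by the proposed strip() call."""
    res = ""
    for item in items:
        res = "%s\t%s" % (res, item)
    return res.strip()

# Hypothetical data: an empty first field and a trailing space in the last field.
items = ["", "abc", "def "]

joined = "\t".join(items)          # keeps the empty field and the trailing space
stripped = loop_then_strip(items)  # strip() eats all leading and trailing whitespace

print(repr(joined))
print(repr(stripped))
```

Splitting `joined` on tabs gives back the original three fields; splitting `stripped` does not.
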


    --
    Steven
     
    Steven D'Aprano, Aug 16, 2010
    #5
  6. In article <4c687936$0$11100$>,
    Steven D'Aprano <> wrote:

    > On Sun, 15 Aug 2010 12:10:10 -0700, Steven Howe wrote:
    >
    > >> Strings have a join method for this:
    > >> '\t'.join(someList)
    > >>
    > >> Gary Herron

    > > or maybe:
    > > -----------------------------------------
    > > res = ""
    > > for item in myList:
    > > res = "%s\t%s" % ( res, item )

    >
    >
    > Under what possible circumstances would you prefer this code to the
    > built-in str.join method?
    >
    > Particularly since the code isn't even correct, as it adds a spurious tab
    > character at the beginning of the result string.


    I think you answered your own question. The possible circumstance would
    be a "find the bug" question on a programming interview :) Actually,
    there is (at least) one situation where this produces the correct
    result, can you find it?

    The other problem is that the verbose version is O(n^2) and str.join()
    is O(n).
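
A rough way to see the difference, offered as an illustration rather than a benchmark: count the characters in every intermediate string the loop builds, as a stand-in for copying work, and compare with the length of the single string join produces. The function names are invented for this sketch.

```python
def chars_built_by_loop(items, sep="\t"):
    """Total length of all intermediate strings created by the loop."""
    res = ""
    total = 0
    for item in items:
        res = "%s%s%s" % (res, sep, item)  # builds a brand-new string on each pass
        total += len(res)
    return total

def chars_built_by_join(items, sep="\t"):
    """Length of the one string that join builds."""
    return len(sep.join(items))

for n in (100, 200):
    items = ["x"] * n
    print(n, chars_built_by_loop(items), chars_built_by_join(items))
```

With one-character items, doubling the list roughly quadruples the loop's total while the join figure only doubles.
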
     
    Roy Smith, Aug 16, 2010
    #6
  7. On 15 Aug 2010 23:33:10 GMT
    Steven D'Aprano <> wrote:
    > Under what possible circumstances would you prefer this code to the
    > built-in str.join method?


    I assumed that it was a trap for someone asking for us to do his
    homework. I also thought that it was a waste of time because I knew
    that twenty people would jump in with the correct answer because of
    "finally, one that I can answer" syndrome.

    --
    D'Arcy J.M. Cain <> | Democracy is three wolves
    http://www.druid.net/darcy/ | and a sheep voting on
    +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
     
    D'Arcy J.M. Cain, Aug 16, 2010
    #7
  8. On Sun, 15 Aug 2010 19:58:54 -0400, Roy Smith wrote:

    > Actually,
    > there is (at least) one situation where this produces the correct
    > result, can you find it?


    When myList is empty, it correctly gives the empty string.



    --
    Steven
     
    Steven D'Aprano, Aug 16, 2010
    #8
  9. Thanks much,

    Nope, no homework. This was a serious question from a serious but perhaps
    simple physicist who grew up with Algol, FORTRAN and Pascal, taught himself
    VB(A) and is looking for a replacement of VB and finding that in Python. You
    can guess my age now.

    Most of my work I do in R nowadays but R is not flexible enough for some
    file manipulation operations. I use the book by Lutz ("Learning Python").
    The join method for strings is in there. I did not have the book at hand and
    I was jetlagged too. I do apologize for asking a simple question.

    I had no idea that some would go to the extent of giving trick solutions for
    simple, supposedly homework questions. Bear in mind Python is a very feature
    rich language. You cannot expect all newbies to remember everything.

    By the way, I had a working program that did what I wanted using still
    simpler string concatenation. Replaced that now by tab.join([lines[k][2]
    for i in range(5)]), k being a loop counter. Judge for yourself. That is the
    level I am at after 6 weeks of doing exercises from my programming book on
    Pascal in Python.
    Thanks for the help. I do hope there is no entry level for using this group.
    If there is, I won't meet it for a while.
    Alex van der Spek

    "D'Arcy J.M. Cain" <> wrote in message
    news:...
    > On 15 Aug 2010 23:33:10 GMT
    > Steven D'Aprano <> wrote:
    >> Under what possible circumstances would you prefer this code to the
    >> built-in str.join method?

    >
    > I assumed that it was a trap for someone asking for us to do his
    > homework. I also thought that it was a waste of time because I knew
    > that twenty people would jump in with the correct answer because of
    > "finally, one that I can answer" syndrome.
    >
    > --
    > D'Arcy J.M. Cain <> | Democracy is three wolves
    > http://www.druid.net/darcy/ | and a sheep voting on
    > +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
     
    Alex van der Spek, Aug 16, 2010
    #9
  10. Perhaps the ones here who think I was trying to make you do my homework can
    actually help me for real. Since I run my own company (not working for any
    of the big ones) I can't afford official training in anything. So I teach
    myself, help is always welcome and sought for. If that feels like doing
    homework for me, so be it.

    The fact is that I do try to learn Python. It can do things I thought
    required much more coding. Look at the attached. It builds a concordance
    table first. That was an exercise from a book on Pascal programming. In
    Pascal the solution is 2 pages of code. In Python it is 8 lines. Beautiful!

    Anybody catches any other ways to improve my program (attached), you are
    most welcome. Help me learn, that is one of the objectives of this
    newsgroup, right? Or is it all about exchanging the next to impossible
    solution to the never to happen unreal world problems?

    Regards,
    Alex van der Spek


    "D'Arcy J.M. Cain" <> wrote in message
    news:...
    > On 15 Aug 2010 23:33:10 GMT
    > Steven D'Aprano <> wrote:
    >> Under what possible circumstances would you prefer this code to the
    >> built-in str.join method?

    >
    > I assumed that it was a trap for someone asking for us to do his
    > homework. I also thought that it was a waste of time because I knew
    > that twenty people would jump in with the correct answer because of
    > "finally, one that I can answer" syndrome.
    >
    > --
    > D'Arcy J.M. Cain <> | Democracy is three wolves
    > http://www.druid.net/darcy/ | and a sheep voting on
    > +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
     
    Alex van der Spek, Aug 16, 2010
    #10
  11. On Mon, 16 Aug 2010 18:26:46 +0200
    "Alex van der Spek" <> wrote:
    > Nope, no homework. This was a serious question from a serious but perhaps
    > simple physicist who grew up with Algol, FORTRAN and Pascal, taught himself
    > VB(A) and is looking for a replacement of VB and finding that in Python. You
    > can guess my age now.
    >
    > Most of my work I do in R nowadays but R is not flexible enough for some
    > file manipulation operations. I use the book by Lutz ("Learning Python").
    > The join method for strings is in there. I did not have the book at hand and
    > I was jetlagged too. I do apologize for asking a simple question.


    I'm not actually the one that presented the convoluted example. I
    think the one who did just felt that someone had a question and they
    were passing it to the group instead of doing a simple Google search.
    The "solution" he posted looked like something designed to make the
    teacher scratch his head and ask embarrassing questions of the student.

    > Thanks for the help. I do hope there is no entry level for using this group.
    > If there is, I won't meet it for a while.


    I think that the only thing people expect is that you do a quick search
    first and show that you have tried first. Some questions have been
    asked and answered so many times that a search of the archives finds
    what you want without waiting for an answer.

    --
    D'Arcy J.M. Cain <> | Democracy is three wolves
    http://www.druid.net/darcy/ | and a sheep voting on
    +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
     
    D'Arcy J.M. Cain, Aug 16, 2010
    #11
  12. On Mon, 16 Aug 2010 18:44:08 +0200
    "Alex van der Spek" <> wrote:
    > Perhaps the ones here who think I was trying to make you do my homework can


    You keep replying to my message but as I pointed out in my previous
    message, I'm not the one who thought that you posted a homework
    question. I'm the one who thought that the other poster thought that
    you posted a homework question. Honestly, while I thought it was a
    question that could have been answered faster with a Google search, it
    did not look like a homework question to me.

    > actually help me for real. Since I run my own company (not working for any
    > of the big ones) I can't afford official training in anything. So I teach
    > myself, help is always welcome and sought for. If that feels like doing
    > homework for me, so be it.


    Well, it is "home" work but there is nothing wrong with asking for help
    anyway. When people complain about homework questions it is generally
    because someone has posted the question verbatim from the assignment
    and asks for a complete solution. That's annoying. What you have done
    here is good because you show some work and ask for help with it.
    Slightly better would be to ask specific questions about areas that you
    are struggling with but this is good.

    > The fact is that I do try to learn Python. It can do things I thought
    > required much more coding. Look at the attached. It builds a concordance
    > table first. That was an exercise from a book on Pascal programming. In
    > Pascal the solution is 2 pages of code. In Python it is 8 lines. Beautiful!


    I guess the real entry level test here is that you have to be smart
    enough to choose Python since it is the best language. You pass. :)

    --
    D'Arcy J.M. Cain <> | Democracy is three wolves
    http://www.druid.net/darcy/ | and a sheep voting on
    +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
     
    D'Arcy J.M. Cain, Aug 16, 2010
    #12
  13. On 8/16/2010 12:44 PM, Alex van der Spek wrote:
    >
    > Anybody catches any other ways to improve my program (attached), you are
    > most welcome.


    1. You don't need to separate out special characters (TABs, NEWLINEs,
    etc.) in a string. So:

    bt='-999.25'+'\t''-999.25'+'\t''-999.25'+'\t''-999.25'+'\t'+'-999.25'

    .... can be ...

    bt='-999.25\t-999.25\t-999.25\t-999.25\t-999.25'

    BTW, I think you made a couple of "lucky errors" in this statement.
    Where there are two consecutive apostrophe (') characters, did you mean
    to put a plus sign in between? Your statement is valid because the
    Python interpreter concatenates strings for you:

    >>> x = 'foo''bar'
    >>> x == 'foobar'

    True

    >>> x = 'foo' 'bar'
    >>> x == 'foobar'

    True


    2. Take a look at the functions in the os.path module:

    http://docs.python.org/library/os.path.html

    These functions might simplify your pathname manipulations. (I didn't
    look closely enough to know for sure.)

    3. An alternative to:

    alf.write(tp+'\t'+vf+'\t'+vq+'\t'+al+'\t'+bt+'\t'+vs+'\n')

    ... is ...

    alf.write("\t".join((tp, vf, vq, al, bt, vs)) + "\n")

    4. I suggest using a helper function to bring that super-long
    column-heading line (alf.write('Timestamp ...) under control:

    def multi_field_names(base_name, count, sep_string):
        names = [base_name + " " + str(i) for i in range(1, count+1)]
        return sep_string.join(names)
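
One possible way to use such a helper for the heading line, as a sketch only. It assumes the columns are tab-separated (the separators are not visible in the posted line) and that there are five channels, as the "1" to "5" suffixes suggest. The constants TAB and CHANNELS are names invented here.

```python
def multi_field_names(base_name, count, sep_string):
    names = [base_name + " " + str(i) for i in range(1, count + 1)]
    return sep_string.join(names)

TAB = "\t"    # assumed column separator
CHANNELS = 5  # assumed channel count, read off the "1".."5" suffixes

# Build the heading from its parts instead of one very long literal.
header = TAB.join([
    "Timestamp (mm/dd/yyyy hh:mm:ss)",
    multi_field_names("VF", CHANNELS, TAB),
    multi_field_names("Q", CHANNELS, TAB),
    "Status", "State", "Metric",
    multi_field_names("Band Temperature", CHANNELS, TAB),
    multi_field_names("SPL", CHANNELS, TAB),
]) + "\n"

print(header)
```
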

    HTH,
    John
     
    John Posner, Aug 16, 2010
    #13
  14. Hi Alex,

    On 2010-08-16 18:44, Alex van der Spek wrote:
    > Anybody catches any other ways to improve my program (attached), you are
    > most welcome. Help me learn, that is one of the objectives of this
    > newsgroup, right? Or is it all about exchanging the next to impossible
    > solution to the never to happen unreal world problems?


    I don't know what a concordance table is, and I haven't
    looked a lot into your program, but anyway here are some
    things I noticed at a glance:

    | #! usr/bin/env python
    | # Merge log files to autolog file
    | import os
    | import fileinput
    | #top='C:\\Documents and Settings\\avanderspek\\My Documents\\CiDRAdata\\Syncrude\\CSL\\August2010'
    | top='C:\\Users\\ZDoor\\Documents\\CiDRA\\Syncrude\CSL\\August2010'

    If you have backslashes in strings, you might want to use
    "raw strings". Instead of "c:\\Users\\ZDoor" you'd write
    r"c:\Users\ZDoor" (notice the r in front of the string).
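
A short sketch of the equivalence, using a shortened version of the path for illustration. One caveat worth knowing: a raw string literal cannot end in a single backslash.

```python
# The same path written with doubled backslashes and as a raw string.
escaped = "C:\\Users\\ZDoor\\Documents"
raw = r"C:\Users\ZDoor\Documents"
same = (escaped == raw)

# Without the r prefix "\t" is one tab character; with it, a backslash and a "t".
tab_length = len("\t")
raw_length = len(r"\t")

print(same, tab_length, raw_length)
```
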

    | i,j,k=0,0,0
    | date={}

    I suggest to use more spacing to make the code more
    readable. Have a look at

    http://www.python.org/dev/peps/pep-0008/

    for more formatting (and other) tips.

    | fps=0.3048
    | tab='\t'
    |
    | bt='-999.25'+'\t''-999.25'+'\t''-999.25'+'\t''-999.25'+'\t'+'-999.25'

    If these numbers are always the same, you should use
    something like

    NUMBER = "-999.25"
    COLUMNS = 5
    bt = "\t".join(COLUMNS * [NUMBER])

    (with better naming, of course).

    Why don't you use `tab` here?

    I _highly_ recommend to use longer (unabbreviated) names.

    | al='Status'+'\t'+'State'+'\t'+'-999.25'
    |
    | for root,dirs,files in os.walk(top):
    | #Build a concordance table of days on which data was collected
    | for name in files:
    | ext=name.split('.',1)[1]

    There's a function `splitext` in `os.path`.
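
A small comparison of the two approaches, with made-up filenames. Note that splitext keeps the leading dot on the extension and returns an empty string when there is none, whereas indexing the result of split raises IndexError for a name without a dot.

```python
import os.path

def extension_by_split(name):
    # The approach in the quoted program: everything after the first dot.
    return name.split(".", 1)[1]

def extension_by_splitext(name):
    # os.path.splitext returns (root, ext); ext keeps its leading dot.
    return os.path.splitext(name)[1]

# Hypothetical filenames, chosen to show where the two approaches differ.
for name in ("log_day12.txt", "archive.tar.gz", "README"):
    try:
        by_split = extension_by_split(name)
    except IndexError:
        by_split = None  # no dot in the name
    print(name, by_split, extension_by_splitext(name))
```
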

    | if ext=='txt':
    | dat=name.split('_')[1].split('y')[1]
    | if dat in date.keys():

    You can just write `if dat in date` (in Python versions >=
    2.2, I think).
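
For illustration, three ways the counting could be written. The `days` list is a hypothetical stand-in for the values parsed from the filenames; collections.Counter is available from Python 2.7 and 3.1 onwards.

```python
from collections import Counter

# Hypothetical day labels standing in for the values parsed from filenames.
days = ["12", "13", "12", "14", "12"]

# Membership test directly on the dict; no .keys() call needed.
date = {}
for dat in days:
    if dat in date:
        date[dat] += 1
    else:
        date[dat] = 1

# dict.get with a default folds the if/else into one line.
date_get = {}
for dat in days:
    date_get[dat] = date_get.get(dat, 0) + 1

# Counter does the bookkeeping itself.
date_counter = Counter(days)

print(date, date_get, dict(date_counter))
```
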

    | date[dat]+=1
    | else:
    | date[dat]=1
    | print 'Concordance table of days:'
    | print date
    | print 'List of files processed:'
    | #Build a list of same day filenames, 5 max for a profile meter,skip first and last days
    | for f in sorted(date.keys())[2:-1]:
    | logs=[]
    | for name in files:
    | ext=name.split('.')[1]
    | if ext=='txt':
    | dat=name.split('_')[1].split('y')[1]

    I guess I'd move the parsing stuff (`x.split(s)` etc.)
    into small functions with meaningful names. After that I'd
    probably notice there's much redundancy and refactor them. ;)

    | if dat==f:
    | logs.append(os.path.join(root,name))
    | #Open the files and read line by line
    | datsec=False
    | lines=[[] for i in range(5)]

    One thing to watch out for: The above is different from
    `[[]] * 5` which uses the _same_ empty list for all entries.
    Probably the semantics you chose is correct.
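
A quick sketch of the difference, using throwaway names:

```python
# Five distinct empty lists.
independent = [[] for _ in range(5)]

# Five references to one and the same empty list.
shared = [[]] * 5

independent[0].append("x")
shared[0].append("x")

print(independent)  # only the first inner list changed
print(shared)       # every entry shows the change, since they are one object
```
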

    | fn=0
    | for line in fileinput.input(logs):
    | if line.split()[0]=='DataID':
    | datsec=True
    | ln=0
    | if datsec:
    | lines[fn].append(line.split())
    | ln+=1
    | if ln==10255:

    This looks like a "magic number" and should be turned into a
    constant.

    | datsec=False
    | fileinput.nextfile()
    | fn+=1
    | print fileinput.filename().rsplit('\\',1)[1]
    | fileinput.close()
    | aut='000_AutoLog'+f+'.log'
    | out=os.path.join(root,aut)
    | alf=open(out,'w')
    | alf.write('Timestamp (mm/dd/yyyy hh:mm:ss) VF 1 VF 2 VF 3 VF 4 VF 5 Q 1 Q 2 Q 3 Q 4 Q 5 Status State Metric Band Temperature 1 Band Temperature 2 Band Temperature 3 Band Temperature 4 Band Temperature 5 SPL 1 SPL 2 SPL 3 SPL 4 SPL 5'+'\n')
    | for wn in range(1,10255,1):

    You don't need to write the step argument if it's 1.

    | for i in range(5):
    | lines[wn][2]=str(float(lines[wn][2])/fps)
    | tp=lines[0][wn][0]+' '+lines[0][wn][1]
    | vf=tab.join([lines[wn][2] for i in range(5)])
    | vq=tab.join([lines[wn][3] for i in range(5)])
    | vs=tab.join([lines[wn][4] for i in range(5)])
    | #sf=tab.join([lines[wn][5] for i in range(5)])
    | #sq=tab.join([lines[wn][6] for i in range(5)])
    | #ss=tab.join([lines[wn][7] for i in range(5)])

    Maybe use an extra function?

    def choose_a_better_name():
        return tab.join([lines[index][wn][2] for index in range(5)])

    Moreover, the repetition of this line looks as if you wanted
    to put the right hand sides of the assignments in a list,
    instead of assigning to distinct names (`vf` etc.).

    By the way, you use the number 5 a lot. I guess this should
    be a constant, too.

    | alf.write(tp+'\t'+vf+'\t'+vq+'\t'+al+'\t'+bt+'\t'+vs+'\n')

    Suggestion: Use

    tab.join([tp, vf, vq, al, bt, vs]) + "\n"

    Again, not using distinct variables would have an advantage
    here.

    | alf.close()
    | print "Done"

    Stefan
     
    Stefan Schwarzer, Aug 17, 2010
    #14
  15. On 2010-08-17, Stefan Schwarzer <> wrote:
    > Hi Alex,
    >
    > On 2010-08-16 18:44, Alex van der Spek wrote:
    >> Anybody catches any other ways to improve my program (attached), you are
    >> most welcome. Help me learn, that is one of the objectives of this
    >> newsgroup, right? Or is it all about exchanging the next to impossible
    >> solution to the never to happen unreal world problems?

    >
    > I don't know what a concordance table is, and I haven't
    > looked a lot into your program, but anyway here are some
    > things I noticed at a glance:
    >
    >| #! usr/bin/env python
    >| # Merge log files to autolog file
    >| import os
    >| import fileinput
    >| #top='C:\\Documents and Settings\\avanderspek\\My Documents\\CiDRAdata\\Syncrude\\CSL\\August2010'
    >| top='C:\\Users\\ZDoor\\Documents\\CiDRA\\Syncrude\CSL\\August2010'
    >
    > If you have backslashes in strings, you might want to use "raw
    > strings". Instead of "c:\\Users\\ZDoor" you'd write
    > r"c:\Users\ZDoor" (notice the r in front of the string).


    That's good general advice. But in the specific case of file
    paths, using '/' as the separator is supported, and somewhat
    preferable.

    --
    Neil Cerutti
     
    Neil Cerutti, Aug 17, 2010
    #15
  16. On 2010-08-17, Neil Cerutti <> wrote:
    > On 2010-08-17, Stefan Schwarzer <> wrote:
    >> Hi Alex,
    >>
    >> On 2010-08-16 18:44, Alex van der Spek wrote:
    >>> Anybody catches any other ways to improve my program (attached), you are
    >>> most welcome. Help me learn, that is one of the objectives of this
    >>> newsgroup, right? Or is it all about exchanging the next to impossible
    >>> solution to the never to happen unreal world problems?

    >>
    >> I don't know what a concordance table is, and I haven't
    >> looked a lot into your program, but anyway here are some
    >> things I noticed at a glance:
    >>
    >>| #! usr/bin/env python
    >>| # Merge log files to autolog file
    >>| import os
    >>| import fileinput
    >>| #top='C:\\Documents and Settings\\avanderspek\\My Documents\\CiDRAdata\\Syncrude\\CSL\\August2010'
    >>| top='C:\\Users\\ZDoor\\Documents\\CiDRA\\Syncrude\CSL\\August2010'
    >>
    >> If you have backslashes in strings, you might want to use "raw
    >> strings". Instead of "c:\\Users\\ZDoor" you'd write
    >> r"c:\Users\ZDoor" (notice the r in front of the string).

    >
    > That's good general advice. But in the specific case of file
    > paths, using '/' as the separator is supported, and somewhat
    > preferable.


    Unless you're going to be passing them to cmd.exe or other utilities
    via subprocess/popen.

    --
    Grant Edwards (grant.b.edwards at gmail.com)
    Yow! MY income is ALL disposable!
     
    Grant Edwards, Aug 17, 2010
    #16
  17. On 08/17/2010 05:46 PM, Grant Edwards wrote:
    > On 2010-08-17, Neil Cerutti <> wrote:
    >> On 2010-08-17, Stefan Schwarzer <> wrote:
    >>> Hi Alex,
    >>>
    >>> On 2010-08-16 18:44, Alex van der Spek wrote:
    >>>> Anybody catches any other ways to improve my program (attached), you are
    >>>> most welcome. Help me learn, that is one of the objectives of this
    >>>> newsgroup, right? Or is it all about exchanging the next to impossible
    >>>> solution to the never to happen unreal world problems?
    >>>
    >>> I don't know what a concordance table is, and I haven't
    >>> looked a lot into your program, but anyway here are some
    >>> things I noticed at a glance:
    >>>
    >>> | #! usr/bin/env python
    >>> | # Merge log files to autolog file
    >>> | import os
    >>> | import fileinput
    >>> | #top='C:\\Documents and Settings\\avanderspek\\My Documents\\CiDRAdata\\Syncrude\\CSL\\August2010'
    >>> | top='C:\\Users\\ZDoor\\Documents\\CiDRA\\Syncrude\CSL\\August2010'
    >>>
    >>> If you have backslashes in strings, you might want to use "raw
    >>> strings". Instead of "c:\\Users\\ZDoor" you'd write
    >>> r"c:\Users\ZDoor" (notice the r in front of the string).

    >>
    >> That's good general advice. But in the specific case of file
    >> paths, using '/' as the separator is supported, and somewhat
    >> preferable.

    >
    > Unless you're going to be passing them to cmd.exe or other utilities
    > via subprocess/popen.
    >

    In that case you could call os.path.normpath() before passing the path to
    an external program and use forward slashes internally.


    A little less performant, but in my opinion nicer typing.
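
A sketch of that idea. It uses ntpath, which is the Windows implementation behind os.path, so the example behaves the same on any platform; on Windows itself os.path.normpath gives the same result. The path is a shortened, illustrative one.

```python
import ntpath  # Windows flavour of os.path, importable on every platform

# Forward slashes inside the program (shortened, illustrative path).
internal = "C:/Users/ZDoor/Documents/CiDRA"

# Convert to backslashes just before handing the path to an external tool.
external = ntpath.normpath(internal)

# normpath also collapses doubled separators and ".." components.
tidied = ntpath.normpath("C:/a//b/../c")

print(external)
print(tidied)
```
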
     
    News123, Aug 17, 2010
    #17
