Opposite of split

Alex van der Spek · Aug 15, 2010

Looking for a method that does the opposite of 'split', i.e. elements in a
list are automatically concatenated with a user selectable spacer in between
e.g. '\t'. This is to prepare lines to be written to a sequential file by
'write'.

All hints welcome.

Regards,
Alex van der Spek

Wieland Hoffmann · Aug 15, 2010

Looking for a method that does the opposite of 'split', i.e. elements in
a list are automatically concatenated with a user selectable spacer in
between e.g. '\t'.

" ".join(["i","am","a","list"])

'i am a list'

Wieland

Gary Herron · Aug 15, 2010

Looking for a method that does the opposite of 'split', i.e. elements
in a list are automatically concatenated with a user selectable spacer
in between e.g. '\t'. This is to prepare lines to be written to a
sequential file by 'write'.

All hints welcome.

Regards,
Alex van der Spek

Strings have a join method for this:
'\t'.join(someList)

Gary Herron
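
A minimal sketch of how join fits the stated goal of preparing tab-separated lines for write. The row data is made up, and io.StringIO stands in for a real file so the example is self-contained. Note that join only accepts strings, so other values are converted with str first.

```python
import io

# Hypothetical rows; in real code these would come from the parsed input.
rows = [["2010-08-15", 1.5, 3], ["2010-08-16", 2.25, 4]]

# io.StringIO stands in for a file opened with open(path, "w").
out = io.StringIO()
for row in rows:
    # join() needs strings, so convert each field first.
    out.write("\t".join(str(field) for field in row) + "\n")

text = out.getvalue()
```

Splitting a written line on "\t" gives the fields back, which is the sense in which join is the opposite of split.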

Steven Howe · Aug 15, 2010

Strings have a join method for this:
'\t'.join(someList)

Gary Herron

or maybe:
-----------------------------------------
res = ""
for item in myList:
    res = "%s\t%s" % ( res, item )

>>> myList = ["abc","def","hjk"]
>>> res = ""
>>> for item in myList:
...     res = "%s\t%s" % ( res, item )
...
>>> res
'\tabc\tdef\thjk'

>>> print res
	abc	def	hjk

Note the leading tab.
-----------------------------------------
So: res.strip() gives abc def hjk

simple enough. Strange you had to ask.

sph

Steven D'Aprano · Aug 15, 2010

or maybe:

Under what possible circumstances would you prefer this code to the built-
in str.join method?

Particularly since the code isn't even correct, as it adds a spurious tab
character at the beginning of the result string.

(By the way, your solution, to call res.strip(), is incorrect, as it
removes too much.)
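
A small sketch of both points, assuming a list whose first and last fields are empty and whose third field ends in a space (made-up data; loop_join is a hypothetical name wrapping the loop from the earlier post):

```python
def loop_join(items):
    # The loop from the earlier post, unchanged in behaviour.
    res = ""
    for item in items:
        res = "%s\t%s" % (res, item)
    return res

fields = ["", "abc", "def ", ""]   # empty first/last fields, trailing space

expected = "\t".join(fields)       # '\tabc\tdef \t'
buggy = loop_join(fields)          # same text with one extra leading tab
stripped = buggy.strip()           # also removes the real empty fields
dropped_first = buggy[1:]          # removes only the spurious tab
```

Here strip() discards the tabs that delimit the genuine empty fields, and the trailing space as well, so the stripped result no longer splits back into the original list.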

Roy Smith · Aug 15, 2010

Steven D'Aprano said:
Under what possible circumstances would you prefer this code to the built-
in str.join method?

Particularly since the code isn't even correct, as it adds a spurious tab
character at the beginning of the result string.

I think you answered your own question. The possible circumstance would
be a "find the bug" question on a programming interview.

Actually, there is (at least) one situation where this produces the
correct result, can you find it?

The other problem is that the verbose version is O(n^2) and str.join()
is O(n).
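
For comparison, a sketch of a hand-written loop that does match str.join (loop_join_fixed is a hypothetical name). Every pass still builds a new string that copies what has been accumulated so far, which is where the quadratic cost comes from, so this is for illustration only.

```python
def loop_join_fixed(items, sep="\t"):
    # The separator is only added between items, so there is no
    # leading tab in the result.
    res = ""
    for index, item in enumerate(items):
        if index > 0:
            res = res + sep      # builds a new string from the old one
        res = res + item
    return res

words = ["abc", "def", "hjk"]
by_loop = loop_join_fixed(words)
by_join = "\t".join(words)
```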

D'Arcy J.M. Cain · Aug 15, 2010

Under what possible circumstances would you prefer this code to the built-
in str.join method?

I assumed that it was a trap for someone asking for us to do his
homework. I also thought that it was a waste of time because I knew
that twenty people would jump in with the correct answer because of
"finally, one that I can answer" syndrome.

Steven D'Aprano · Aug 15, 2010

Actually, there is (at least) one situation where this produces the
correct result, can you find it?

When myList is empty, it correctly gives the empty string.

Alex van der Spek · Aug 16, 2010

Thanks much,

Nope, no homework. This was a serious question from a serious but perhaps
simple physicist who grew up with Algol, FORTRAN and Pascal, taught himself
VB(A) and is looking for a replacement of VB and finding that in Python. You
can guess my age now.

Most of my work I do in R nowadays but R is not flexible enough for some
file manipulation operations. I use the book by Lutz ("Learning Python").
The join method for strings is in there. I did not have the book at hand and
I was jetlagged too. I do apologize for asking a simple question.

I had no idea that some would go to the extent of giving trick solutions for
simple, supposedly homework questions. Bear in mind Python is a very feature
rich language. You cannot expect all newbies to remember everything.

By the way, I had a working program that did what I wanted using still
simpler string concatenation. Replaced that now by tab.join([lines[k][2]
for i in range(5)]), k being a loop counter. Judge for yourself. That is the
level I am at after 6 weeks of doing exercises from my programming book on
Pascal in Python.
Thanks for the help. I do hope there is no entry level for using this group.
If there is, I won't meet it for a while.
Alex van der Spek

Alex van der Spek · Aug 16, 2010

Perhaps the ones here who think I was trying to make you do my homework can
actually help me for real. Since I run my own company (not working for any
of the big ones) I can't afford official training in anything. So I teach
myself, help is always welcome and sought for. If that feels like doing
homework for me, so be it.

The fact is that I do try to learn Python. It can do things I thought
required much more coding. Look at the attached. It builds a concordance
table first. That was an exercise from a book on Pascal programming. In
Pascal the solution is 2 pages of code. In Python it is 8 lines. Beautiful!

Anybody catches any other ways to improve my program (attached), you are
most welcome. Help me learn, that is one of the objectives of this
newsgroup, right? Or is it all about exchanging the next to impossible
solution to the never to happen unreal world problems?

Regards,
Alex van der Spek

D'Arcy J.M. Cain · Aug 16, 2010

Nope, no homework. This was a serious question from a serious but perhaps
simple physicist who grew up with Algol, FORTRAN and Pascal, taught himself
VB(A) and is looking for a replacement of VB and finding that in Python. You
can guess my age now.

Most of my work I do in R nowadays but R is not flexible enough for some
file manipulation operations. I use the book by Lutz ("Learning Python").
The join method for strings is in there. I did not have the book at hand and
I was jetlagged too. I do apologize for asking a simple question.

I'm not actually the one that presented the convoluted example. I
think the one who did just felt that someone had a question and they
were passing it to the group instead of doing a simple Google search.
The "solution" he posted looked like something designed to make the
teacher scratch his head and ask embarrassing questions of the student.

Thanks for the help. I do hope there is no entry level for using this group.
If there is, I won't meet it for a while.

I think that the only thing people expect is that you do a quick search
first and show that you have tried first. Some questions have been
asked and answered so many times that a search of the archives finds
what you want without waiting for an answer.

D'Arcy J.M. Cain · Aug 16, 2010

Perhaps the ones here who think I was trying to make you do my homework can

You keep replying to my message but as I pointed out in my previous
message, I'm not the one who thought that you posted a homework
question. I'm the one who thought that the other poster thought that
you posted a homework question. Honestly, while I thought it was a
question that could have been answered faster with a Google search, it
did not look like a homework question to me.

actually help me for real. Since I run my own company (not working for any
of the big ones) I can't afford official training in anything. So I teach
myself, help is always welcome and sought for. If that feels like doing
homework for me, so be it.

Well, it is "home" work but there is nothing wrong with asking for help
anyway. When people complain about homework questions it is generally
because someone has posted the question verbatim from the assignment
and asks for a complete solution. That's annoying. What you have done
here is good because you show some work and ask for help with it.
Slightly better would be to ask specific questions about areas that you
are struggling with but this is good.

The fact is that I do try to learn Python. It can do things I thought
required much more coding. Look at the attached. It builds a concordance
table first. That was an exercise from a book on Pascal programming. In
Pascal the solution is 2 pages of code. In Python it is 8 lines. Beautiful!

I guess the real entry level test here is that you have to be smart
enough to choose Python since it is the best language. You pass.

John Posner · Aug 16, 2010

Anybody catches any other ways to improve my program (attached), you are
most welcome.

1. You don't need to separate out special characters (TABs, NEWLINEs,
etc.) in a string. So:

bt='-999.25'+'\t''-999.25'+'\t''-999.25'+'\t''-999.25'+'\t'+'-999.25'

.... can be ...

bt='-999.25\t-999.25\t-999.25\t-999.25\t-999.25'

BTW, I think you made a couple of "lucky errors" in this statement.
Where there are two consecutive apostrophe (') characters, did you mean
to put a plus sign in between? Your statement is valid because the
Python interpreter concatenates adjacent string literals for you, so
'\t''-999.25' and '\t'+'-999.25' both compare equal (True) to
'\t-999.25'.
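
A short check of the adjacent-literal behaviour, using values from the bt line (the variable names are only for illustration):

```python
# Adjacent string literals are merged into one string, with or without '+'.
implicit = '-999.25' '\t' '-999.25'
explicit = '-999.25' + '\t' + '-999.25'
single = '-999.25\t-999.25'

all_equal = implicit == explicit == single
```

This only works for literals written next to each other; two variables side by side with no operator are a syntax error.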

2. Take a look at the functions in the os.path module:

http://docs.python.org/library/os.path.html

These functions might simplify your pathname manipulations. (I didn't
look closely enough to know for sure.)

3. An alternative to:

alf.write(tp+'\t'+vf+'\t'+vq+'\t'+al+'\t'+bt+'\t'+vs+'\n')

... is ...

alf.write("\t".join((tp, vf, vq, al, bt, vs)) + "\n")

4. I suggest using a helper function to bring that super-long
column-heading line (alf.write('Timestamp ...) under control:

def multi_field_names(base_name, count, sep_string):
    names = [base_name + " " + str(i) for i in range(1, count+1)]
    return sep_string.join(names)
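
A possible way to use the helper, as a sketch only. The group names are taken from the header line in the posted program, but the groups list and the partial header built from it are assumptions, not the full original header.

```python
def multi_field_names(base_name, count, sep_string):
    names = [base_name + " " + str(i) for i in range(1, count + 1)]
    return sep_string.join(names)

# Hypothetical subset of the column groups in the header line.
groups = ["VF", "Q", "Band Temperature", "SPL"]

# Each group expands to five numbered columns, all separated by tabs.
header = "\t".join(multi_field_names(g, 5, "\t") for g in groups)
columns = header.split("\t")
```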

HTH,
John

Stefan Schwarzer · Aug 17, 2010

Hi Alex,

Anybody catches any other ways to improve my program (attached), you are
most welcome. Help me learn, that is one of the objectives of this
newsgroup, right? Or is it all about exchanging the next to impossible
solution to the never to happen unreal world problems?

I don't know what a concordance table is, and I haven't
looked a lot into your program, but anyway here are some
things I noticed at a glance:

| #! usr/bin/env python
| # Merge log files to autolog file
| import os
| import fileinput
| #top='C:\\Documents and Settings\\avanderspek\\My Documents\\CiDRAdata\\Syncrude\\CSL\\August2010'
| top='C:\\Users\\ZDoor\\Documents\\CiDRA\\Syncrude\CSL\\August2010'

If you have backslashes in strings, you might want to use
"raw strings". Instead of "c:\\Users\\ZDoor" you'd write
r"c:\Users\ZDoor" (notice the r in front of the string).
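
A quick sketch of the difference, using a made-up C:\temp path for the pitfall case:

```python
# Doubled backslashes and a raw string spell the same path.
escaped = "C:\\Users\\ZDoor\\Documents"
raw = r"C:\Users\ZDoor\Documents"
same = escaped == raw

# Without the r prefix, "\t" is read as a TAB character.
oops = "C:\temp"
fine = r"C:\temp"
```

One limitation worth knowing: a raw string literal cannot end in a single backslash.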

| i,j,k=0,0,0
| date={}

I suggest using more spacing to make the code more
readable. Have a look at

http://www.python.org/dev/peps/pep-0008/

for more formatting (and other) tips.

| fps=0.3048
| tab='\t'
|
| bt='-999.25'+'\t''-999.25'+'\t''-999.25'+'\t''-999.25'+'\t'+'-999.25'

If these numbers are always the same, you should use
something like

NUMBER = "-999.25"
COLUMNS = 5
bt = "\t".join(COLUMNS * [NUMBER])

(with better naming, of course).

Why don't you use `tab` here?

I _highly_ recommend using longer (unabbreviated) names.

| al='Status'+'\t'+'State'+'\t'+'-999.25'
|
| for root,dirs,files in os.walk(top):
|     #Build a concordance table of days on which data was collected
|     for name in files:
|         ext=name.split('.',1)[1]

There's a function `splitext` in `os.path`.
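
A sketch of how the two approaches differ, assuming a hypothetical file name with two dots in it:

```python
import os.path

name = "CSL_day12.part1.txt"   # hypothetical file name with two dots

# split('.', 1) keeps everything after the FIRST dot.
by_split = name.split('.', 1)[1]

# splitext() separates the extension after the LAST dot, dot included.
root, ext = os.path.splitext(name)

# A name without any dot gives an empty extension instead of an IndexError.
no_ext = os.path.splitext("README")
```

With splitext the comparison in the program would be against '.txt' rather than 'txt'.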

|         if ext=='txt':
|             dat=name.split('_')[1].split('y')[1]
|             if dat in date.keys():

You can just write `if dat in date` (in Python versions >=
2.2, I think).
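
As a sketch with made-up day labels, the membership test reads like this, and the if/else can also be folded into dict.get with a default of 0 (date_get is a hypothetical name for the second table):

```python
days = ["12", "13", "12", "12"]   # hypothetical day labels

date = {}
for dat in days:
    if dat in date:               # same test as 'dat in date.keys()'
        date[dat] += 1
    else:
        date[dat] = 1

# Alternative: let get() supply 0 for days not seen yet.
date_get = {}
for dat in days:
    date_get[dat] = date_get.get(dat, 0) + 1
```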

|                 date[dat]+=1
|             else:
|                 date[dat]=1
|     print 'Concordance table of days:'
|     print date
|     print 'List of files processed:'
|     #Build a list of same day filenames, 5 max for a profile meter,skip first and last days
|     for f in sorted(date.keys())[2:-1]:
|         logs=[]
|         for name in files:
|             ext=name.split('.')[1]
|             if ext=='txt':
|                 dat=name.split('_')[1].split('y')[1]

I guess I'd move the parsing stuff (`x.split(s)` etc.)
into small functions with meaningful names. After that I'd
probably notice there's much redundancy and refactor them.

|                 if dat==f:
|                     logs.append(os.path.join(root,name))
|         #Open the files and read line by line
|         datsec=False
|         lines=[[] for i in range(5)]

One thing to watch out for: The above is different from
`[[]] * 5` which uses the _same_ empty list for all entries.
Probably the semantics you chose is correct.
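
A small sketch of the difference (the variable names are only for illustration):

```python
# Five separate empty lists.
independent = [[] for i in range(5)]

# Five references to ONE empty list.
shared = [[]] * 5

independent[0].append("x")
shared[0].append("x")

# Only the first sublist of 'independent' changed; in 'shared' the
# appended item shows up in every position.
same_object = shared[0] is shared[4]
```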

|         fn=0
|         for line in fileinput.input(logs):
|             if line.split()[0]=='DataID':
|                 datsec=True
|                 ln=0
|             if datsec:
|                 lines[fn].append(line.split())
|                 ln+=1
|                 if ln==10255:

This looks like a "magic number" and should be turned into a
constant.

|                     datsec=False
|                     fileinput.nextfile()
|                     fn+=1
|                     print fileinput.filename().rsplit('\\',1)[1]
|         fileinput.close()
|         aut='000_AutoLog'+f+'.log'
|         out=os.path.join(root,aut)
|         alf=open(out,'w')
|         alf.write('Timestamp (mm/dd/yyyy hh:mm:ss) VF 1 VF 2 VF 3 VF 4 VF 5 Q 1 Q 2 Q 3 Q 4 Q 5 Status State Metric Band Temperature 1 Band Temperature 2 Band Temperature 3 Band Temperature 4 Band Temperature 5 SPL 1 SPL 2 SPL 3 SPL 4 SPL 5'+'\n')
|         for wn in range(1,10255,1):

You don't need to write the step argument if it's 1.

|             for i in range(5):
|                 lines[wn][2]=str(float(lines[wn][2])/fps)
|             tp=lines[0][wn][0]+' '+lines[0][wn][1]
|             vf=tab.join([lines[wn][2] for i in range(5)])
|             vq=tab.join([lines[wn][3] for i in range(5)])
|             vs=tab.join([lines[wn][4] for i in range(5)])
|             #sf=tab.join([lines[wn][5] for i in range(5)])
|             #sq=tab.join([lines[wn][6] for i in range(5)])
|             #ss=tab.join([lines[wn][7] for i in range(5)])

Maybe use an extra function?

def choose_a_better_name():
    return tab.join([lines[index][wn][2] for index in range(5)])

Moreover, the repetition of this line looks as if you wanted
to put the right hand sides of the assignments in a list,
instead of assigning to distinct names (`vf` etc.).

By the way, you use the number 5 a lot. I guess this should
be a constant, too.

|             alf.write(tp+'\t'+vf+'\t'+vq+'\t'+al+'\t'+bt+'\t'+vs+'\n')

Suggestion: Use

tab.join([tp, vf, vq, al, bt, vs]) + "\n"

Again, not using distinct variables would have an advantage
here.

|         alf.close()
| print "Done"

Stefan

Neil Cerutti · Aug 17, 2010

Hi Alex,

I don't know what a concordance table is, and I haven't
looked a lot into your program, but anyway here are some
things I noticed at a glance:

| #! usr/bin/env python
| # Merge log files to autolog file
| import os
| import fileinput
| #top='C:\\Documents and Settings\\avanderspek\\My Documents\\CiDRAdata\\Syncrude\\CSL\\August2010'
| top='C:\\Users\\ZDoor\\Documents\\CiDRA\\Syncrude\CSL\\August2010'

If you have backslashes in strings, you might want to use "raw
strings". Instead of "c:\\Users\\ZDoor" you'd write
r"c:\Users\ZDoor" (notice the r in front of the string).

That's good general advice. But in the specific case of file
paths, using '/' as the separator is supported, and somewhat
preferable.

Grant Edwards · Aug 17, 2010

That's good general advice. But in the specific case of file
paths, using '/' as the separator is supported, and somewhat
preferable.

Unless you're going to be passing them to cmd.exe or other utilities
via subprocess/popen.

News123 · Aug 17, 2010

Unless you're going to be passing them to cmd.exe or other utilities
via subprocess/popen.

In that case you could use os.path.normpath() prior to passing it to an
external program and use slashes internally.

A little less performant, but in my opinion nicer typing.
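
A sketch of that idea. It imports ntpath, the Windows flavour of os.path, so that it behaves the same on any platform; on Windows itself os.path.normpath is the same function, while on other systems os.path.normpath leaves forward slashes alone. The path is made up from the one in the posted program.

```python
import ntpath  # Windows implementation of os.path, importable anywhere

# Forward slashes used internally.
internal = "C:/Users/ZDoor/Documents/CiDRA"

# Converted to backslashes just before handing it to an external program.
for_cmd = ntpath.normpath(internal)
```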
