formatting a listing

George George

i have a listing which looks like this
1
2
3

3
4
5

20
3
5
5

i would like a quick way to format that output such that i end up with 3
columns like the following.
1,2,20
2,4,3
3,5,5
5

any ideas?
thanks
 
Robert Dober

i have a listing which looks like this
1
2
3

3
4
5

20
3
5
5

i would like a quick way to format that output such that i end up with 3
columns like the following.
1,2,20
2,4,3
3,5,5
5

any ideas?
thanks

If this is a homework assignment I am terribly sorry, but I cannot know.

puts DATA.inject( [ [] ] ){ | ary, ele |
  ele.strip.empty? ? ary << [] : ary.last << ele.to_i # or ele.chomp if you prefer
  ary
}
.reject( &:empty? ) # .map( &:join, ", " ) is not standard Ruby
.map{ | eles | eles.join( ", " )}

__END__
1
2

42
44
46

2000

HTH
Robert
-- 
Toutes les grandes personnes ont d’abord été des enfants, mais peu
d’entre elles s’en souviennent.

All adults have been children first, but not many remember.

[Antoine de Saint-Exupéry]
 
Pascal J. Bourguignon

George George said:
i have a listing which looks like this
1
2
3

3
4
5

20
3
5
5

i would like a quick way to format that output such that i end up with 3
columns like the following.
1,2,20
2,4,3
3,5,5
5

any ideas?

It's hard to tell, since there seem to be random changes in the output
data vs. the input. Well, one random change. Or do you really mean
to subtract 1 from the first row, second column of the output?
What about the missing commas? Why are the numbers left-aligned?




(printf "%s\n\n" ,
(begin
(matrix = (Array . new))
(ncols = 1)
(matrix . push(Array . new))
(((IO . readlines'/tmp/test.data') .
map { | line | (line . strip) }) .
each { | item | (if (item == "")
(ncols = (ncols + 1))
(matrix . push(Array . new))
else
((matrix . at(ncols - 1)) . push item)
end)
})
(height = ((matrix . map { | column | (column . size) }) . max))
(widths = (matrix . map { | column | ((column . map { | cell | ((sprintf("%s" , cell)) . size) }) . max) }))
((((matrix . map { | column | (column + (Array . new((height - (column . size)) , ""))) }) .
transpose) .
map { | row | ((([ row , widths ] . transpose) .
map { | r , w | (sprintf( "%*s" , w , r)) }) .
join ",")}) .
join "\n")
end))

1,3,20
2,4, 3
3,5, 5
, , 5
 
George George

Robert said:
3
any ideas?
thanks

If this is a homework assignment I am terribly sorry, but I cannot
know.

puts DATA.inject( [ [] ] ){ | ary, ele |
  ele.strip.empty? ? ary << [] : ary.last << ele.to_i # or ele.chomp if you prefer
  ary
}
.reject( &:empty? ) # .map( &:join, ", " ) is not standard Ruby
.map{ | eles | eles.join( ", " )}

__END__
1
2

42
44
46

2000

HTH
Robert

Thanks so much for pointing the way! And this is not an assignment :) I am
working on some huge datasets that run up to 20 000 columns and
was looking for an easy way of creating the columns.


puts DATA.inject( [ [] ] ){ | ary, ele |
ele.strip.empty? ? ary << [] : ary.last << ele.to_i # or ele.chomp
#if you prefer
# ary
}
#.reject( &:empty? ) # .map( &:join, ", " ) is not standard Ruby
#.map{ | eles | eles.join( ", " )}

__END__
1
2

42
44
46

I am not trying to ask too much, but your code is real witchcraft :)
It raises an error:
`<<': can't convert Array into Integer (TypeError)
 
Robert Dober

Robert said:
3
any ideas?
thanks

If this is a homework assignment I am terribly sorry, but I cannot
know.

puts DATA.inject( [ [] ] ){ | ary, ele |
  ele.strip.empty? ? ary << [] : ary.last << ele.to_i # or ele.chomp if you prefer
  ary
}
.reject( &:empty? ) # .map( &:join, ", " ) is not standard Ruby
.map{ | eles | eles.join( ", " )}

__END__
1
2

42
44
46

2000

HTH
Robert

Thanks so much for pointing the way! And this is not an assignment :) I am
working on some huge datasets that run up to 20 000 columns and
was looking for an easy way of creating the columns.
If you have perf problems, do not worry: inject is slow. Replace
inject(x){ |a,e|
}
with
loc = x
each { |e|
  loc << e # e.g.
}
loc
# you will have a speedup of 2~3
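
Applied to the block-grouping code quoted above, that replacement might look like the sketch below. The method name group_blocks and the sample lines are made up for illustration, and the 2~3 figure is not measured here. Because there is no accumulator to hand back, forgetting the final `ary` line (which is what leads to the TypeError reported earlier in the thread) cannot happen in this form.

```ruby
# Group lines into blocks separated by blank lines, using each instead of inject.
# group_blocks is a hypothetical helper name, not code from the thread.
def group_blocks(lines)
  blocks = [[]]
  lines.each do |line|
    if line.strip.empty?
      blocks << []              # a blank line starts a new block
    else
      blocks.last << line.strip
    end
  end
  blocks.reject { |b| b.empty? }
end

sample = ["1\n", "2\n", "\n", "42\n", "44\n", "46\n", "\n", "2000\n"]
group_blocks(sample).each { |block| puts block.join(", ") }
```

With inject, the block's last expression becomes the next accumulator, so dropping `ary` makes the second iteration operate on whatever `<<` returned instead of on the array of blocks.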
 
George George

Pascal said:
It's hard to tell, since there seem to be random changes in the output
data vs. the input. Well, one random change. Or do you really mean
to subtract 1 from the first row, second column of the output?
What about the missing commas? Why the numbers are left-aligned?
Thank you so much for the reply.
The first column is a column of numbers separated by some white space.
The idea is to capture the first block and create a column, then
capture the next one and create another column adjacent to the previous
one, such that
if
1
2
3
whitespace(one or more)
3
4
5
6
whitespace(one or more)
20
2
 
Robert Dober

Really Pascal, you do *not* need to sign your mails ;)



-- 
Toutes les grandes personnes ont d’abord été des enfants, mais peu
d’entre elles s’en souviennent.

All adults have been children first, but not many remember.

[Antoine de Saint-Exupéry]
 
George George

.....
such that every block of numbers separated by white space becomes a new
row in the output, e.g. for the above case ....

What I meant was column, not row (sorry).
What I need is to transform that into (call it a matrix)
such that every block of numbers separated by white space becomes a new
column in the output, e.g. for the above case
 
George George

(printf "%s\n\n" ,
(begin
(matrix = (Array.new))
(ncols = 1)
(matrix.push(Array.new))
(((IO.readlines'/home/george/test.data').
map { | line | (line.strip) }).
each { | item | (if (item == "")
(ncols = (ncols + 1))
(matrix.push(Array.new))
else
((matrix.at(ncols - 1)).push item)
end)
})
(height = ((matrix.map { | column | (column.size) }).max))
(widths = (matrix.map { | column | ((column.map { | cell |
((sprintf("%s" , cell)).size) }).max) }))
((((matrix.map { | column | (column + (Array.new((height -
(column.size)) , ""))) }) .
transpose).
map { | row | ((([ row , widths ].transpose).
map { | r , w | (sprintf( "%*s" , w , r)) }) .
join ",")}).
join "\n")
end))

produces an error:
in `sprintf': no implicit conversion from nil to integer (TypeError)
 
Pascal J. Bourguignon

George George said:
(printf "%s\n\n" ,
(begin
(matrix = (Array.new))
(ncols = 1)
(matrix.push(Array.new))
(((IO.readlines'/home/george/test.data').
map { | line | (line.strip) }).
each { | item | (if (item == "")
(ncols = (ncols + 1))
(matrix.push(Array.new))
else
((matrix.at(ncols - 1)).push item)
end)
})
(height = ((matrix.map { | column | (column.size) }).max))
(widths = (matrix.map { | column | ((column.map { | cell |
((sprintf("%s" , cell)).size) }).max) }))
((((matrix.map { | column | (column + (Array.new((height -
(column.size)) , ""))) }) .
transpose).
map { | row | ((([ row , widths ].transpose).
map { | r , w | (sprintf( "%*s" , w , r)) }) .
join ",")}).
join "\n")
end))

produces an error:
in `sprintf': no implicit conversion from nil to integer (TypeError)


Well, I tried it with this file:
-----
1
2
3

3
4
5

20
3
5
5

-----

You could try executing it expression by expression to see where that nil comes from.


irb(main):850:0>
(matrix = (Array . new))
[]
irb(main):856:0>
(ncols = 1)
1
irb(main):862:0>
(matrix . push(Array . new))
[[]]
irb(main):868:0>
(((IO . readlines'/tmp/test.data') .
map { | line | (line . strip) }) .
each { | item | (if (item == "")
(ncols = (ncols + 1))
(matrix . push(Array . new))
else
((matrix . at(ncols - 1)) . push item)
end)
})
["1", "2", "3", "", "3", "4", "5", "", "20", "3", "5", "5"]
irb(main):890:0>
(height = ((matrix . map { | column | (column . size) }) . max))
4
irb(main):896:0>
(widths = (matrix . map { | column | ((column . map { | cell | ((sprintf("%s" , cell)) . size) }) . max) }))
[1, 1, 2]
irb(main):902:0>
((((matrix . map { | column | (column + (Array . new((height - (column . size)) , ""))) }) .
transpose) .
map { | row | ((([ row , widths ] . transpose) .
map { | r , w | (sprintf( "%*s" , w , r)) }) .
join ",")}) .
join "\n")
"1,3,20\n2,4, 3\n3,5, 5\n , , 5"
irb(main):918:0>
(begin
(matrix = (Array . new))
(ncols = 1)
(matrix . push(Array . new))
(((IO . readlines'/tmp/test.data') .
map { | line | (line . strip) }) .
each { | item | (if (item == "")
(ncols = (ncols + 1))
(matrix . push(Array . new))
else
((matrix . at(ncols - 1)) . push item)
end)
})
(height = ((matrix . map { | column | (column . size) }) . max))
(widths = (matrix . map { | column | ((column . map { | cell | ((sprintf("%s" , cell)) . size) }) . max) }))
matrix
((((matrix . map { | column | (column + (Array . new((height - (column . size)) , ""))) }) .
transpose) .
map { | row | ((([ row , widths ] . transpose) .
map { | r , w | (sprintf( "%*s" , w , r)) }) .
join ",")}) .
join "\n")
end)
"1,3,20\n2,4, 3\n3,5, 5\n , , 5"
irb(main):968:0>
(matrix = (Array . new))
[]
irb(main):974:0>
(ncols = 1)
1
irb(main):980:0>
(matrix . push(Array . new))
[[]]
irb(main):986:0>
(((IO . readlines'/tmp/test.data') .
map { | line | (line . strip) }) .
each { | item | (if (item == "")
(ncols = (ncols + 1))
(matrix . push(Array . new))
else
((matrix . at(ncols - 1)) . push item)
end)
})
["1", "2", "3", "", "3", "4", "5", "", "20", "3", "5", "5"]
irb(main):1005:0>
matrix
[["1", "2", "3"], ["3", "4", "5"], ["20", "3", "5", "5"]]
irb(main):1008:0>
(height = ((matrix . map { | column | (column . size) }) . max))
4
irb(main):1014:0>
(widths = (matrix . map { | column | ((column . map { | cell | ((sprintf("%s" , cell)) . size) }) . max) }))
[1, 1, 2]
irb(main):1020:0>
matrix
[["1", "2", "3"], ["3", "4", "5"], ["20", "3", "5", "5"]]
irb(main):1026:0>
((((matrix . map { | column | (column + (Array . new((height - (column . size)) , ""))) }) .
transpose) .
map { | row | ((([ row , widths ] . transpose) .
map { | r , w | (sprintf( "%*s" , w , r)) }) .
join ",")}) .
join "\n")
"1,3,20\n2,4, 3\n3,5, 5\n , , 5"
irb(main):1042:0>
 
George George

Thanks a lot Pascal! The code is working, though a little complex, but a
very nice learning piece of code. One nitty-gritty thing is that not all
the columns are separated by a comma. For example, your code

File.open("/home/george/test_r.csv",'w') do |f|
printf "%s\n\n",
begin
matrix = Array.new
ncols = 1
matrix.push(Array.new)
(IO.readlines '/home/george/test.data').
map { | line | (line.strip) }.
each { | item | if item == ""
ncols = (ncols + 1)
matrix.push(Array.new)
else
matrix.at(ncols - 1).push item
end
}

height = (matrix.map { | column | (column.size) }).max

widths = matrix.map { | column | ((column.map { | cell |
((sprintf("%s" , cell)).size) }).max) }

f.puts((matrix.map { | column | (column + (Array.new((height -
column.size) , ""))) }).transpose.
map { | row | ((([ row , widths ].transpose).
map { | r , w | (sprintf( "%*s" , w , r)) }).join ",")}.join "\n")
end
end

when given a file(test.data) containing
1 2
2 4

1 4
2 3

1 3
2 3
3 5
4 6

it produces (test_r.csv)

1 2,1 4,1 3
2 4,2 3,2 3
, ,3 5
, ,4 6

which is almost there!
the ideal output would have been

1, 2,1, 4,1, 3
2, 4,2, 3,2, 3
, , ,3, 5
, , ,4, 6

Sorry am asking for a horse and the saddle :) what little modification
should i do to produce the second output?
Thank you so much!

George
 
Pascal J. Bourguignon

George George said:
Thanks a lot Pascal! The code is working, though a little complex, but a
very nice learning piece of code. One nitty-gritty thing is that not all
the columns are separated by a comma. For example, your code

File.open("/home/george/test_r.csv",'w') do |f|
printf "%s\n\n",
begin
matrix = Array.new
ncols = 1
matrix.push(Array.new)
(IO.readlines '/home/george/test.data').
map { | line | (line.strip) }.
each { | item | if item == ""
ncols = (ncols + 1)
matrix.push(Array.new)
else
matrix.at(ncols - 1).push item
end
}

height = (matrix.map { | column | (column.size) }).max

widths = matrix.map { | column | ((column.map { | cell |
((sprintf("%s" , cell)).size) }).max) }

f.puts((matrix.map { | column | (column + (Array.new((height -
column.size) , ""))) }).transpose.
map { | row | ((([ row , widths ].transpose).
map { | r , w | (sprintf( "%*s" , w , r)) }).join ",")}.join "\n")
end
end

when given a file(test.data) containing
1 2
2 4

1 4
2 3

1 3
2 3
3 5
4 6

it produces (test_r.csv)

1 2,1 4,1 3
2 4,2 3,2 3
, ,3 5
, ,4 6

which is almost there!
the ideal output would have been

1, 2,1, 4,1, 3
2, 4,2, 3,2, 3
, , ,3, 5
, , ,4, 6

Sorry am asking for a horse and the saddle :) what little modification
should i do to produce the second output?

You could add a split and an each somewhere.


irb(main):001:0> "1 2".split(" ").each{|x| puts x}
1
2
["1", "2"]
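
One way that hint could be worked out is sketched below. It is not a patch to the code above: blocks_to_csv_lines is a made-up name, the sketch assumes every row inside a block has the same number of fields as that block's first row, and it does not reproduce the width padding, so the fields come out without leading spaces.

```ruby
# Each block holds rows of whitespace-separated fields; blocks sit side by side.
# blocks_to_csv_lines is a hypothetical name used only for this sketch.
def blocks_to_csv_lines(text)
  blocks = text.split(/\n\s*\n/).map do |block|
    block.split("\n").map { |line| line.split(" ") }   # the extra split
  end
  blocks = blocks.reject { |b| b.empty? }
  height = blocks.map { |b| b.size }.max || 0
  (0...height).map do |i|
    blocks.map do |b|
      width = b.first.size
      row = b[i] || Array.new(width, "")   # pad short blocks with empty fields
      row.join(",")
    end.join(",")
  end
end

data = "1 2\n2 4\n\n1 4\n2 3\n\n1 3\n2 3\n3 5\n4 6\n"
puts blocks_to_csv_lines(data)
# prints:
# 1,2,1,4,1,3
# 2,4,2,3,2,3
# ,,,,3,5
# ,,,,4,6
```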
 
Brian Candler

George said:
i have a listing which looks like this
1
2
3

3
4
5

20
3
5
5

i would like a quick way to format that output such that i end up with 3
columns like the following.
1,2,20
2,4,3
3,5,5
5

any ideas?

src = "1\n2\n3\n\n3\n4\n5\n\n20\n3\n5\n5"
rows = src.split("\n\n").collect { |b| b.split("\n") }
m = rows.collect { |r| r.size }.max
m.times { |i| puts rows.collect { |r| r[i] }.join(",") }

No witchcraft there.

I'd say it's better to use the fastercsv gem for outputting, as it handles
values which need quoting (e.g. values which themselves contain commas)

require 'rubygems'
require 'fastercsv'
src = "1\n2\n3\n\n3\n4\n5\n\n20\n3\n5,9\n5"
rows = src.split("\n\n").collect { |b| b.split("\n") }
m = rows.collect { |r| r.size }.max
FCSV { |out| m.times { |i| out << rows.collect { |r| r[i] } } }
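
From Ruby 1.9 onward FasterCSV was adopted as the standard csv library, so a roughly equivalent sketch can be written without the gem. The variable names cols and csv_text are chosen here for illustration; the sample data is the same as above.

```ruby
require 'csv'

src  = "1\n2\n3\n\n3\n4\n5\n\n20\n3\n5,9\n5"
cols = src.split("\n\n").collect { |b| b.split("\n") }
m    = cols.collect { |c| c.size }.max

# Build one CSV line per output row. Missing cells (nil) become empty fields,
# and a value that contains a comma is quoted by the library.
csv_text = CSV.generate do |out|
  m.times { |i| out << cols.collect { |c| c[i] } }
end
puts csv_text
```

The third output row carries the value 5,9 wrapped in double quotes, which is the quoting behaviour a plain join(",") would not give.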
 
George George

Brian said:
George George wrote:
any ideas?

src = "1\n2\n3\n\n3\n4\n5\n\n20\n3\n5\n5"
rows = src.split("\n\n").collect { |b| b.split("\n") }
m = rows.collect { |r| r.size }.max
m.times { |i| puts rows.collect { |r| r[i] }.join(",") }

No witchcraft there.

I'd say it's better to use fastercsv gem for outputting, as it handles
values which need quoting (e.g. values which themselves contain commas)

require 'rubygems'
require 'fastercsv'
src = "1\n2\n3\n\n3\n4\n5\n\n20\n3\n5,9\n5"
rows = src.split("\n\n").collect { |b| b.split("\n") }
m = rows.collect { |r| r.size }.max
FCSV { |out| m.times { |i| out << rows.collect { |r| r[i] } } }


Hi Brian!
Thanks for the example. However, since I want the code to read from a
file and write to another file, I have done this:
File.open("/home/george/output.csv","w") do |f|
  src = ""
  File.open('/home/george/sequences/results_processed.csv').each do |rows|
    src << rows
  end
  rows = src.split("\n\n").collect { |b| b.split("\n") }
  m = rows.collect { |r| r.size }.max
  m.times { |i| f.puts rows.collect { |r| r[i] }.join(",") }
end

but it prints out the same old original file without modifications

Any ideas?
 
David A. Black

Hi --

Thanks a lot Pascal! The code is working, though a little complex, but a
very nice learning piece of code. One nitty-gritty thing is that not all
the columns are separated by a comma. For example, your code

File.open("/home/george/test_r.csv",'w') do |f|
printf "%s\n\n",
begin
matrix = Array.new
ncols = 1
matrix.push(Array.new)
(IO.readlines '/home/george/test.data').
map { | line | (line.strip) }.

Don't use the extra parentheses around expressions. They result in
unidiomatic and obfuscated code. They're an artifact of some very
specific disgruntlement about the fact that Ruby differs from Common
Lisp, and they shouldn't be emulated. (Search the ruby-talk archives
for more on this if interested.)
when given a file(test.data) containing
1 2
2 4

1 4
2 3

1 3
2 3
3 5
4 6

it produces (test_r.csv)

1 2,1 4,1 3
2 4,2 3,2 3
, ,3 5
, ,4 6

which is almost there!
the ideal output would have been

1, 2,1, 4,1, 3
2, 4,2, 3,2, 3
, , ,3, 5
, , ,4, 6

Sorry am asking for a horse and the saddle :) what little modification
should i do to produce the second output?

Could you write an example using letters, instead of digits? I'm
having trouble mapping the input to the output, and having unique
symbols would help.

Judging from the comma count, the last two rows have fewer fields than
the first two. Is that right?


David

--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Now available: The Well-Grounded Rubyist (http://manning.com/black2)
"Ruby 1.9: What You Need To Know" Envycasts with David A. Black
http://www.envycasts.com
 
Brian Candler

George said:
but it prints out the same old original file without modifications

It works for me (I just copy-pasted your code and changed the
input/output filenames).

The simplest way to debug this is to add some extra debugging statements
to see if the variables contain what you expect, for example:

...
rows = src.split("\n\n").collect { |b| b.split("\n") }
p rows # << debugging statement

or more verbosely,

STDERR.puts "rows = #{rows.inspect}"

At a guess, perhaps you're on a Windows platform and the paragraphs are
terminated by \r\n\r\n instead of \n\n. Try:

rows = src.split(/\r?\n\r?\n/).collect ... etc

In a regular expression, \r? means 0 or 1 occurrences of \r
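
A slightly more forgiving variant of the same idea is sketched below. It assumes the separator lines may also contain stray spaces or tabs, which is a guess about the input rather than something confirmed in this thread; split_blocks is a made-up helper name.

```ruby
# split_blocks is a hypothetical helper: it normalises line endings first,
# then treats any run of blank or whitespace-only lines as a block separator.
def split_blocks(src)
  src.gsub(/\r\n?/, "\n")          # CRLF or lone CR -> LF
     .split(/\n(?:[ \t]*\n)+/)     # one or more blank-ish lines end a block
     .collect { |b| b.split("\n").collect { |l| l.strip } }
     .reject { |b| b.empty? }
end

p split_blocks("1\r\n2\r\n\r\n3\r\n4\r\n")
p split_blocks("1\n2\n \n\n3\n4\n")
```

Both calls should yield the same two blocks, so printing the result with p is a quick way to see whether the separator is what the code expects.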
 
George George

Brian said:
At a guess, perhaps you're on a Windows platform and the paragraphs are
terminated by \r\n\r\n instead of \n\n. Try:

rows = src.split(/\r?\n\r?\n/).collect ... etc

In a regular expression, \r? means 0 or 1 occurrences of \r

Thank you Brian. I am on Ubuntu Linux with Ruby 1.8.7. Let me add some
debugging statements and work out why it's not printing the desired output.

@David Black:
My input consists of digits and not letters, that is why i used digits
in the examples above; in short what i really wanted is: Given a listing
1
2
3
white space
3.05
4
5
whitespace
-6.3
7

Produce an output such that:
1,-3.05,-6.3
2,4,7
3,5

or
1,-3.05,-6.3
2,4,7
3,5,0
 
David A. Black

@David Black:
My input consists of digits and not letters, that is why i used digits
in the examples above; in short what i really wanted is: Given a listing
1
2
3
white space
3.05
4
5
whitespace
-6.3
7

Produce an output such that:
1,-3.05,-6.3

That's a spurious - before the 3, right?
2,4,7
3,5

or
1,-3.05,-6.3
2,4,7
3,5,0

The example that I couldn't quite figure out was the one with input
like:

1 2,3 4

or something like that. Anyway, a simple version (which may not handle
that case) is:

$/ = "\n\n" # if the input is definitely \n\n delimited

cols = File.open("input.csv") do |fh|
  max = 0
  fh.map do |s|
    row = s.scan(/\S+/)
    max = [max, row.size].max
    row << "0" until row.size == max
    row
  end
end


David

--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Now available: The Well-Grounded Rubyist (http://manning.com/black2)
"Ruby 1.9: What You Need To Know" Envycasts with David A. Black
http://www.envycasts.com
 
George George

David said:
whitespace
-6.3
7

Produce an output such that:
1,-3.05,-6.3

That's a spurious - before the 3, right?
2,4,7
3,5

or
1,-3.05,-6.3
2,4,7
3,5,0

The example that I couldn't quite figure out was the one with input
like:

1 2,3 4

or something like that. Anyway, a simple version (which may not handle
that case) is:

$/ = "\n\n" # if the input is definitely \n\n delimited

cols = File.open("input.csv") do |fh|
  max = 0
  fh.map do |s|
    row = s.scan(/\S+/)
    max = [max, row.size].max
    row << "0" until row.size == max
    row
  end
end


David

Hi David!
Thanks for the reply, but the output given by the above code again
only produces a single column instead of multiple columns. What I want is,
given
1
2
3
white space
3.05
4
5
whitespace
-6.3
7

Produce an output such that:
1,-3.05,-6.3
2,4,7
3,5

Please If you don't mind i can send you my actual input file off the
list to try with. my email is georgkam hosted with the google email
domain
 
David A. Black

Hi --

David said:
whitespace
-6.3
7

Produce an output such that:
1,-3.05,-6.3

That's a spurious - before the 3, right?
2,4,7
3,5

or
1,-3.05,-6.3
2,4,7
3,5,0

The example that I couldn't quite figure out was the one with input
like:

1 2,3 4

or something like that. Anyway, a simple version (which may not handle
that case) is:

$/ = "\n\n" # if the input is definitely \n\n delimited

cols = File.open("input.csv") do |fh|
  max = 0
  fh.map do |s|
    row = s.scan(/\S+/)
    max = [max, row.size].max
    row << "0" until row.size == max
    row
  end
end


David

Hi David!
Thanks for the reply, but the output given by the above code again
only produces a single column instead of multiple columns. What I want is,
given
1
2
3
white space
3.05
4
5
whitespace
-6.3
7

Produce an output such that:
1,-3.05,-6.3
2,4,7
3,5

Here's my input and output, along with the program:

$ cat george.rb
$/ = "\n\n"

cols = File.open("input.csv") do |fh|
  max = 0
  fh.map do |s|
    row = s.scan(/\S+/)
    max = [max, row.size].max
    row << "0" until row.size == max
    row
  end
end

rows = cols.transpose.map {|row| row.join(",") }

puts rows
$ cat input.csv
1
2
3

3.05
4
5

-6.3
7
$ ruby george.rb
1,3.05,-6.3
2,4,7
3,5,0

The columns look OK. Note that the $/="\n\n" thing only works if your
input is separated by exactly that sequence. That might be the problem
you're having.
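
If the separator turns out to be less regular, a variant that leaves $/ alone might look like the sketch below. It works on a string (for a file, something like File.read("input.csv") could supply it), columns_to_rows is a made-up name, and the "0" filler follows the program above. It relies on each_line("") reading in paragraph mode, and it pads only after every block has been read, so a later block that is longer than the first does not break transpose.

```ruby
# columns_to_rows is a hypothetical helper name for this sketch.
def columns_to_rows(text, filler = "0")
  # each_line("") is paragraph mode: records end at runs of blank lines.
  cols = text.each_line("").map { |para| para.scan(/\S+/) }.reject { |c| c.empty? }
  height = cols.map { |c| c.size }.max || 0
  cols.each { |c| c << filler until c.size == height }   # pad short columns
  cols.transpose.map { |row| row.join(",") }
end

puts columns_to_rows("1\n2\n3\n\n3.05\n4\n5\n\n-6.3\n7\n")
```

Paragraph mode still expects the separating lines to be truly empty; lines holding only spaces or Windows line endings would need the input normalised first.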


David

--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Now available: The Well-Grounded Rubyist (http://manning.com/black2)
"Ruby 1.9: What You Need To Know" Envycasts with David A. Black
http://www.envycasts.com
 
