Space delimited to comma delimited, with varied spacing and spaces needed within some columns

LHradowy · Sep 20, 2004

I have a file that looks like this...
   1555002   00 0 04 27   TELN NOT BILL
   3555007   00 0 06 00   CUSTOMER HAS > 1
   5555410   00 0 12 10   CUSTOMER HAS > 1
   6755012   00 0 12 06   CUSTOMER HAS > 1

Notice the white spaces at the beginning of the line, I DON'T WANT THEM THERE
Notice the white spaces in the 2nd and 3rd columns, I NEED THEM THERE...

I need to create a Perl script that takes this file and makes it look like
this
1555002,00 0 04 27,TELN NOT BILL
3555007,00 0 06 00,CUSTOMER HAS > 1
5555410,00 0 12 10,CUSTOMER HAS > 1
6755012,00 0 12 06,CUSTOMER HAS > 1

This output needs to be written to a file.
I have no idea how to start. If I split on a space " ", then it will split the
third and fourth columns up. The fourth column can basically be left alone.

Thanks for the help.

Jürgen Exner · Sep 20, 2004

LHradowy said:
I have a file that looks like this...
   1555002   00 0 04 27   TELN NOT BILL
   3555007   00 0 06 00   CUSTOMER HAS > 1
   5555410   00 0 12 10   CUSTOMER HAS > 1
   6755012   00 0 12 06   CUSTOMER HAS > 1

Notice the white spaces at the beginning of the line, I DON'T WANT THEM
THERE

Please see the thread "Replacing spaces" that was discussed here over the
weekend.

Notice the white spaces in the 2nd and 3rd columns, I NEED THEM
THERE...

The solutions posted in the thread mentioned above will leave those alone.

I need to create a Perl script that takes this file

perldoc -f open

perldoc perlop (and check for the <> operator)

and makes it look like this
1555002,00 0 04 27,TELN NOT BILL
3555007,00 0 06 00,CUSTOMER HAS > 1
5555410,00 0 12 10,CUSTOMER HAS > 1
6755012,00 0 12 06,CUSTOMER HAS > 1

This output needs to be written to a file.

perldoc -f open
perldoc -f print

I have no idea how to start. If I split on a space " ", then it will
split the third and fourth columns up. The fourth column can basically
be left alone.

So, what is the distinguishing difference between the separator for the
items in the third column on the one hand and the separator between the
third column and the fourth column on the other hand?

jue

Shawn Corey · Sep 20, 2004

Hi,

If the data is in fixed columns, you can use substr.

perldoc -f substr

--- Shawn

Ian Wilson · Sep 20, 2004

LHradowy said:
I have a file that looks like this...
   1555002   00 0 04 27   TELN NOT BILL
   3555007   00 0 06 00   CUSTOMER HAS > 1
   5555410   00 0 12 10   CUSTOMER HAS > 1
   6755012   00 0 12 06   CUSTOMER HAS > 1

Notice the white spaces at the beginning of the line, I DON'T WANT THEM THERE
Notice the white spaces in the 2nd and 3rd columns, I NEED THEM THERE...

I need to create a Perl script that takes this file and makes it look like
this
1555002,00 0 04 27,TELN NOT BILL
3555007,00 0 06 00,CUSTOMER HAS > 1
5555410,00 0 12 10,CUSTOMER HAS > 1
6755012,00 0 12 06,CUSTOMER HAS > 1

This output needs to be written to a file.
I have no idea how to start. If I split on a space " ", then it will split the
third and fourth columns up. The fourth column can basically be left alone.

Thanks for the help.

If the data always has multiple spaces (ASCII 32) between fields, I'd
try stripping the leading spaces and then converting >1 consecutive
spaces to commas:

perl -e -p 's/^ +//; s/  +/,/g' oldfile > newfile

But I expect Shawn's substr solution to be more robust. Using unpack may
be another useful approach.
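
For the fixed-column case, a minimal sketch of the unpack approach might look like the following. The column widths in the template (3 characters of padding, a 7-character number, a 10-character code group) are assumptions made for illustration, as is the helper name fixed_to_csv; they would have to be measured against the real file.

```perl
use strict;
use warnings;

# Assumed layout (hypothetical widths): 3 pad, 7-char number, 3 pad,
# 10-char code group, 3 pad, then free text up to the end of the line.
my $TEMPLATE = 'x3 A7 x3 A10 x3 A*';

# Hypothetical helper: convert one fixed-width record to a comma-separated one.
sub fixed_to_csv {
    my ($line) = @_;
    chomp $line;                               # remove the trailing newline
    return join ',', unpack($TEMPLATE, $line); # "A" fields drop trailing spaces
}

print fixed_to_csv("   1555002   00 0 04 27   TELN NOT BILL\n"), "\n";
```

Because the template counts characters rather than looking at the spaces, this only holds up if every line really uses the same positions; a line shorter than the skipped padding would make unpack die.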

Tore Aursand · Sep 20, 2004

I have a file that looks like this...
   1555002   00 0 04 27   TELN NOT BILL
   3555007   00 0 06 00   CUSTOMER HAS > 1
   5555410   00 0 12 10   CUSTOMER HAS > 1
   6755012   00 0 12 06   CUSTOMER HAS > 1

Notice the white spaces at the beginning of the line, I DON'T WANT THEM THERE
Notice the white spaces in the 2nd and 3rd columns, I NEED THEM THERE...

I need to create a Perl script that takes this file and makes it look like
this
1555002,00 0 04 27,TELN NOT BILL
3555007,00 0 06 00,CUSTOMER HAS > 1
5555410,00 0 12 10,CUSTOMER HAS > 1
6755012,00 0 12 06,CUSTOMER HAS > 1

If we skip everything that has got to do with the file(s), here's a
suggestion (untested);

while ( <DATA> ) {
    chomp; # Get rid of line breaks
    s,^\s+,,; # Remove leading spaces
    my @cols = split( /\s+{2,}/, $_ ); # Split on two (or more) spaces
    print join( ',', @cols ) . "\n";
}

Gunnar Hjalmarsson · Sep 20, 2004

Tore said:
If we skip everything that has got to do with the file(s), here's a
suggestion (untested);

while ( <DATA> ) {
    chomp; # Get rid of line breaks
    s,^\s+,,; # Remove leading spaces
    my @cols = split( /\s+{2,}/, $_ ); # Split on two (or more) spaces
-------------------------^^^^^

Maybe you should have tested it... ;-)

LHradowy · Sep 20, 2004

Tore Aursand said:
If we skip everything that has got to do with the file(s), here's a
suggestion (untested);

while ( <DATA> ) {
    chomp; # Get rid of line breaks
    s,^\s+,,; # Remove leading spaces
    my @cols = split( /\s+{2,}/, $_ ); # Split on two (or more) spaces
    print join( ',', @cols ) . "\n";
}

Ahhh, I think I am forgetting something, THIS is exactly what I want!
But I am getting an error when I run it, and my skills at perl are weak.
#!/opt/perl/bin/perl

use strict;
use warnings;

while (<>) {
    chomp; # Remove the trailing newline
    s,^\s+,,; # Remove leading spaces
    my @cols=split(/\s+{2,}/,$_); # Split on two (or more) spaces
    print join (',',@cols)."\n";
}

user@server$ ./test.pl file
Nested quantifiers in regex; marked by <-- HERE in m/\s+{ <-- HERE 2,}/ at ./test.pl line 10.

LHradowy · Sep 21, 2004

Ian Wilson said:
If the data always has multiple spaces (ASCII 32) between fields, I'd
try stripping the leading spaces and then converting >1 consecutive
spaces to commas:

perl -e -p 's/^ +//; s/  +/,/g' oldfile > newfile

But I expect Shawn's substr solution to be more robust. Using unpack may
be another useful approach.

I like this, but I get nothing back in the new file. And I have no tabs; they
are all spaces.

Tore Aursand · Sep 21, 2004

-----------------------------^^^^^

Maybe you should have tested it... ;-)

You are so right, Gunnar, and I'm terribly sorry. The correct split()
should - of course - look like this:

my @cols = split( /\s{2,}/, $_ );

Still untested, though.

Tore Aursand · Sep 21, 2004

my @cols=split(/\s+{2,}/,$_); #Split on two (or more) spaces

My fault. Don't split on '\s+{2,}', but on '\s{2,}';

my @cols = split( /\s{2,}/, $_ );
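
Put together, the corrected split could be wrapped up as below. This is only a sketch: to_csv_line is a hypothetical helper name, and the sample records assume there are always at least two spaces between columns and never more than one inside a column.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one space-padded record into a comma-separated one.
sub to_csv_line {
    my ($line) = @_;
    $line =~ s/^\s+//;    # drop leading whitespace
    $line =~ s/\s+$//;    # drop trailing whitespace, including the newline
    # Split only where two or more whitespace characters occur in a row,
    # so the single spaces inside a column survive.
    return join ',', split(/\s{2,}/, $line);
}

# Sample records; the exact amount of padding is assumed.
my @sample = (
    "   1555002   00 0 04 27   TELN NOT BILL\n",
    "   3555007   00 0 06 00   CUSTOMER HAS > 1\n",
);
print to_csv_line($_), "\n" for @sample;
```

Replacing the @sample loop with a while (<>) loop would let the same helper process a file named on the command line.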

Anno Siegel · Sep 21, 2004

LHradowy said:
I have a file that looks like this...
   1555002   00 0 04 27   TELN NOT BILL
   3555007   00 0 06 00   CUSTOMER HAS > 1
   5555410   00 0 12 10   CUSTOMER HAS > 1
   6755012   00 0 12 06   CUSTOMER HAS > 1

Notice the white spaces at the beginning of the line, I DON'T WANT THEM THERE
Notice the white spaces in the 2nd and 3rd columns, I NEED THEM THERE...

I need to create a Perl script that takes this file and makes it look like
this
1555002,00 0 04 27,TELN NOT BILL
3555007,00 0 06 00,CUSTOMER HAS > 1
5555410,00 0 12 10,CUSTOMER HAS > 1
6755012,00 0 12 06,CUSTOMER HAS > 1

This output needs to be written to a file.
I have no idea how to start. If I split on a space " ", then it will split the
third and fourth columns up. The fourth column can basically be left alone.

while ( <DATA> ) {
    my @l = split;
    print join( ',', $l[ 0], "@l[ 1 .. 4]", "@l[ 5 .. $#l]"), "\n";
}

Anno

Larry Felton Johnson · Sep 21, 2004

LHradowy said:
I have a file that looks like this...
   1555002   00 0 04 27   TELN NOT BILL
   3555007   00 0 06 00   CUSTOMER HAS > 1
   5555410   00 0 12 10   CUSTOMER HAS > 1
   6755012   00 0 12 06   CUSTOMER HAS > 1

Notice the white spaces at the beginning of the line, I DON'T WANT THEM THERE
Notice the white spaces in the 2nd and 3rd columns, I NEED THEM THERE...

I need to create a Perl script that takes this file and makes it look like
this
1555002,00 0 04 27,TELN NOT BILL
3555007,00 0 06 00,CUSTOMER HAS > 1
5555410,00 0 12 10,CUSTOMER HAS > 1
6755012,00 0 12 06,CUSTOMER HAS > 1

This output needs to be written to a file.
I have no idea how to start. If I split on a space " ", then it will split the
third and fourth columns up. The fourth column can basically be left alone.

Thanks for the help.

I get the idea I may be oversimplifying or misunderstanding some part
of this question, but if there is a uniform number of columns, and of
components within the columns, a simple regex should do it. It's then
just a matter of reconstructing the line with the spacing you want.

perl -pi.bak -e 's/^\s+(\w+)\s+(\w+)\s+(\w+)\s+(\w+)\s+(\w+)\s+(.*)/$1,$2 $3 $4 $5,$6/g' spaces

In my first pass the long and ugly one-liner above did it for me when I
cut and pasted your file snippet into a file called spaces. This
edited the file in place and copied the old file to spaces.bak.
If there's a need to write it to a file of another name, the same regex
could be wrapped in a script opening the infile for reading and the
outfile for writing.

How about it? Am I misunderstanding something here?

Ian Wilson · Sep 21, 2004

LHradowy said:
I like this, but I get nothing back in the new file. And I have no
tabs; they are all spaces.

C:\> type oldname.txt
   1555002   00 0 04 27   TELN NOT BILL
   3555007   00 0 06 00   CUSTOMER HAS > 1
   5555410   00 0 12 10   CUSTOMER HAS > 1
   6755012   00 0 12 06   CUSTOMER HAS > 1

C:\> perl -p -e "s/^ +//; s/  +/,/g" oldname.txt
1555002,00 0 04 27,TELN NOT BILL
3555007,00 0 06 00,CUSTOMER HAS > 1
5555410,00 0 12 10,CUSTOMER HAS > 1
6755012,00 0 12 06,CUSTOMER HAS > 1

I recall some versions of Perl on some versions of Windows have problems
with redirecting STDOUT to a file from a command prompt / DOS window.
Maybe you have one of those combinations?
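
One thing the thread does not raise: in the earlier command the switches were given as -e -p, and -e takes the very next argument as the program text, so the substitutions would never run and the output file would come out empty. If shell redirection is also in doubt, the output file can be opened from inside the script instead. The sketch below assumes that approach; convert_file and the file names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: read $in_path and write the comma-separated
# form of every line to $out_path, without relying on shell redirection.
sub convert_file {
    my ($in_path, $out_path) = @_;
    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    while (my $line = <$in>) {
        $line =~ s/^ +//;        # strip leading spaces
        $line =~ s/ {2,}/,/g;    # each run of two or more spaces becomes one comma
        print {$out} $line;      # the newline is still attached to $line
    }
    close $in;
    close $out or die "Cannot close $out_path: $!";
    return;
}

# Usage (placeholder names): perl convert.pl oldfile newfile
convert_file(@ARGV) if @ARGV == 2;
```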

Larry Felton Johnson · Sep 22, 2004

perl -pi.bak -e 's/^\s+(\w+)\s+(\w+)\s+(\w+)\s+(\w+)\s+(\w+)\s+(.*)/$1,$2 $3 $4 $5,$6/g' spaces

A couple of follow-up things. My g option above (after the last '/')
was a typo. It didn't hurt or help, but was superfluous.

The second is that the whole approach to looking at lines in a file
like this bears a little bit of discussion. When I looked at the
lines, the first thing that entered my mind wasn't "How do I get rid
of the spaces?" but "What always seems to be true about these lines?"

Basically you're looking at a line like this:

some spaces, some digits, space, digits, space, digits, space, digits,
space, digits, space, some variable text with no necessity to format.

I could have used \d+ instead of \w+, but everything in the match
breaks down to \w+, \s+ or .*

So there are only three types of things to match: digits, spaces and
the "everything else" trailing at the end.

Given this, a number of the approaches people have given will all work:
regex, splitting into an array, substr (if the positions are uniform)
and unpack (if the positions are uniform). The task is to capture the
nonspace stuff into usable variables and print them out with inserted
whitespace and any punctuation or labeling characters you choose.
This mental approach gives you much more control over the formatting
and use of the data than thinking of it as simply not wanting the
spaces at the beginning of the line, but wanting to preserve some of
the spaces in the middle.
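
As a rough illustration of that "capture, then print" view, here is a sketch using \d+ for the five numeric groups. The helper name reformat_record and the sample lines are made up for the example, and a line that does not fit the expected shape is simply reported rather than converted.

```perl
use strict;
use warnings;

# Hypothetical helper: capture the five digit groups and the trailing text,
# then rebuild the record with the punctuation we want.
sub reformat_record {
    my ($line) = @_;
    chomp $line;
    if ($line =~ /^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.*)$/) {
        return "$1,$2 $3 $4 $5,$6";
    }
    return;    # undef in scalar context: the line did not match
}

for my $line ("   1555002   00 0 04 27   TELN NOT BILL\n", "not a record\n") {
    my $csv = reformat_record($line);
    print defined $csv ? "$csv\n" : "skipped: $line";
}
```

Since this counts fields instead of measuring the gaps between them, it does not care whether columns are separated by one space or several.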

LHradowy · Sep 22, 2004

I want to thank all of you who have spent time on this problem. What a
tremendous response!
