replace hard return followed by a tab

Francois Massion

I have a list of entries in a table which is available as a tab-
separated text. In the first column there is either a string or
nothing, in the second column there is a string.

It looks like this:

Category [Tab] Architecture [Hard return]
[Tab] Technology [Hard return]
[Tab] Medicine [Hard return]

What I want to achieve is:

Category [Tab] Architecture; Technology; Medicine [Hard return]


Here is my code:

@entry = <DATA>;
foreach $entry (@entry) {
    chomp $entry;
    $entry =~ s/(\r|\n){1,}\t/; /mg;
    print "$entry\n";
}

I have tried different modifiers and simpler replacements like
s/\n\t/; /g;, but to no avail.

Any suggestions?
 
Mathias J. Hennig

Francois said:
I have a list of entries in a table which is available as a tab-
separated text. In the first column there is either a string or
nothing, in the second column there is a string.

It looks like this:

Category [Tab] Architecture [Hard return]
[Tab] Technology [Hard return]
[Tab] Medicine [Hard return]

What I want to achieve is:

Category [Tab] Architecture; Technology; Medicine [Hard return]


$ cat data
Category	Architecture
	Technology
	Medicine
Category	Architecture
	Technology
	Medicine
$ cat test.pl
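#!/usr/bin/env perl
use strict;

# Print the first line without its trailing newline.
$_ = <>, chomp, print;
# A line starting with a tab continues the current record; any other line starts a new one.
chomp, /^\t(.*)/ ? print "; $1" : print "\n$_" while <>;
print "\n";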

$ cat data | ./test.pl
Category	Architecture; Technology; Medicine
Category	Architecture; Technology; Medicine
$


This is the way I would go - just a suggestion... ;)

Greets
Matze
 
Jürgen Exner

Francois Massion said:
I have a list of entries in a table which is available as a tab-
separated text. In the first column there is either a string or
nothing, in the second column there is a string.

It looks like this:

Category [Tab] Architecture [Hard return]
[Tab] Technology [Hard return]
[Tab] Medicine [Hard return]

What I want to achieve is:

Category [Tab] Architecture; Technology; Medicine [Hard return]


Here is my code:

@entry = <DATA>;
foreach $entry (@entry) {

chomp $entry;

You are removing the \n here, ...
$entry=~ s/(\r|\n){1,}\t/; /mg;

... therefore it is pointless to search for it here.

Replace the body of the loop with (code untested):

chomp $entry;
print "\n" unless $entry =~ s/^\t/; /;
print $entry;

You will also have to add an additional print "\n"; after the loop to
close the very last line.

jue
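
Filled out into a complete script, the loop body suggested above might look like the sketch below. It is only an illustration: the join_records helper name and the inline sample lines are assumptions, not part of the original posts, and it builds the result as a string instead of printing piece by piece, which also avoids the empty line before the first record.

```perl
use strict;
use warnings;

# Hypothetical helper: join tab-led continuation lines onto the record above.
sub join_records {
    my @lines = @_;                      # work on a copy of the caller's lines
    my $out = '';
    for my $line (@lines) {
        $line =~ s/\r?\n\z//;            # strip an LF or CRLF line ending
        if ($line =~ s/^\t/; /) {
            $out .= $line;               # continuation: append to current record
        }
        else {
            $out .= "\n" if length $out; # close the previous record, if any
            $out .= $line;               # start a new record
        }
    }
    $out .= "\n" if length $out;         # close the very last record
    return $out;
}

# Assumed sample input, one element per line as with @entry = <DATA>.
my @entry = (
    "Category\tArchitecture\n",
    "\tTechnology\n",
    "\tMedicine\n",
);
print join_records(@entry);
```

The key point is the one made above: each array element holds a single line, so the test has to be "does this line start with a tab", not "is there a newline followed by a tab".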
 
Marc Girod

Any suggestion?

Your input is not made of line records: one record spans several lines.
So, you need to change the input record separator ($/).

Also, I am not sure what a hard line separator is on your platform.
I use \r?\n to cope with both Unix and Windows.

So, here is my own go:

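use strict;

$/ = '';            # paragraph mode: one read returns everything up to the next blank line
$_ = <DATA>;
s/\r?\n\t/; /mg;    # a line break followed by a tab becomes "; "
print;

__DATA__
Category	Architecture
	Technology
	Medicine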
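
For input that may contain blank lines, a variant of the same idea can read the whole input in one go. In Perl an empty-string $/ selects paragraph mode (read up to the next blank line), whereas an undefined $/ slurps everything. The sketch below uses the latter; the slurp_and_join name and the in-memory sample are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole filehandle at once and join continuations.
sub slurp_and_join {
    my ($fh) = @_;
    local $/;                  # undefined separator = slurp mode, restored on return
    my $text = <$fh>;
    $text = '' unless defined $text;
    $text =~ s/\r?\n\t/; /g;   # line break followed by a tab becomes "; "
    return $text;
}

# In-memory filehandle standing in for the real data file (assumption).
my $sample = "Category\tArchitecture\n\tTechnology\n\tMedicine\n";
open my $fh, '<', \$sample or die "cannot open in-memory handle: $!";
print slurp_and_join($fh);
close $fh;
```

Using local inside a sub keeps the change to $/ from leaking into the rest of the program.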
 
Francois Massion

Jürgen Exner said:
Replace the body of the loop with (code untested):

        chomp $entry;
        print "\n" unless $entry =~ s/^\t/; /;
        print $entry;

You will also have to add an additional print "\n"; after the loop to
close the very last line.

That works fine! Thanks a lot.
Francois
 
Mathias J. Hennig

Doug said:
Winning today's UUOC (Useless Use Of Cat) Award.

Instead:

./test.pl < data

Yeah, I have won! Never won anything before - and now I have got two
prizes: An award and a lesson at no charge! Thank you.
 
sln

Francois Massion said:
I have a list of entries in a table which is available as a tab-
separated text. In the first column there is either a string or
nothing, in the second column there is a string.

It looks like this:

Category [Tab] Architecture [Hard return]
[Tab] Technology [Hard return]
[Tab] Medicine [Hard return]

What I want to achieve is:

Category [Tab] Architecture; Technology; Medicine [Hard return]


Here is my code:

@entry = <DATA>;
foreach $entry (@entry) {
    chomp $entry;
    $entry =~ s/(\r|\n){1,}\t/; /mg;
    print "$entry\n";
}

I have tried different modifiers and simpler replacements like
s/\n\t/; /g;, but to no avail.

Any suggestions?

This is another way. Though, all you really need is:
s/ *\n+\t+ */; /g
s/^\n+//mg

-sln

==============================
output:

Category1	Architecture; Technology; Medicine
Category2	Architecture; Technology; Medicine
Category3	Architecture; Technology; Medicine
Category4	Architecture; Technology; Medicine

Category1	Architecture; Technology; Medicine
Category2	Architecture; Technology; Medicine
Category3	Architecture; Technology; Medicine
Category4	Architecture; Technology; Medicine

-----------------------------
use strict;
use warnings;

my $data = join '', <DATA>;

# Normalize data
$data =~ s/ *\n+\t+ */; /g;
$data =~ s/^\n+//mg;
print $data,"\n";

# Create records from data
#my @lines = split /\n/, $data;
print "$_\n" for (split /\n/, $data);

__DATA__

Category1	Architecture
	Technology
	Medicine

Category2	Architecture
	Technology
	Medicine

Category3	Architecture
	Technology
	Medicine

Category4	Architecture
	Technology
	Medicine
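
The "Create records from data" step above could go one step further and split each normalized line into a category and its list of values. The following is a minimal sketch under stated assumptions: parse_records is a hypothetical name, the sample string is made up, and values are assumed never to contain "; " themselves.

```perl
use strict;
use warnings;

# Hypothetical helper: normalize the text with the two substitutions above,
# then return a list of [category, [values...]] pairs, one per record.
sub parse_records {
    my ($data) = @_;
    $data =~ s/ *\n+\t+ */; /g;   # fold continuation lines into the record
    $data =~ s/^\n+//mg;          # drop empty lines
    my @records;
    for my $line (split /\n/, $data) {
        my ($category, $values) = split /\t/, $line, 2;
        next unless defined $values;   # skip lines without a second column
        push @records, [ $category, [ split /; /, $values ] ];
    }
    return @records;
}

# Assumed sample input with a leading and an embedded blank line.
my $sample = "\nCategory1\tArchitecture\n\tTechnology\n\nCategory2\tMedicine\n";
for my $rec (parse_records($sample)) {
    my ($category, $values) = @$rec;
    print "$category => ", join(', ', @$values), "\n";
}
```

Keeping the values as a list rather than a joined string makes it easy to choose a different output separator later.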
 
