stuck in regex

elliot

elliot@dan:/media/KINGSTON$ perl input1.pl
1 #include <stdio.h>
2 #include <malloc.h>
3
4 void testc(double **pa)
5 {
6 double b;
7 double *a;
8 int m;
9
10 a = (double*) malloc(sizeof(double)*5);
11 a[0]=1.23;
12 a[1]=2.46;
13 a[2]=3.69;
14 a[3]=4.11;
15 a[4]=7.21;
16 *pa=a;
17 for (m=0;m<5;m++)
18 {
19 b=a[m];
20 b=b+1.0;
21 a[m]=b;
22 }
23 }


// gcc -c -Wall -Wextra cfile.c -o testc.o
elliot@dan:/media/KINGSTON$ cat input1.pl
#!/usr/bin/perl -w

use strict;
my $filename = "cfile.c";
open FILE, $filename or die $!;
while (<FILE>) { print $_; }





elliot@dan:/media/KINGSTON$

How do I s/// so as to adios the integers?
 
Jürgen Exner

elliot said:
use strict;
my $filename = "cfile.c";
open FILE, $filename or die $!;
while (<FILE>) { print $_; }

How do I s/// so as to adios the integers?

while (<FILE>) {
s/\d+/adios/g;
print $_;
}

That was easy.

jue
 
Jürgen Exner

Tad McClellan said:
But it will change the floating points that are in the data too...

:)

How so? In e.g.
3.1415E01
_I_ see just three natural numbers, separated by a '.' and an 'E'. ;-)

However, I did forget about the optional sign on integers, so the OP
may want to add that to the RE.
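
For anyone who does want signed integers replaced while floats are left alone, here is a minimal sketch. The helper names and the sample strings are made up for illustration, and the lookarounds are only one way to draw the line between "integer" and "part of a float". It does not handle a signed exponent such as 1.0E-5.

```perl
use strict;
use warnings;

# The plain replacement from this post: every run of digits is replaced.
sub naive_replace {
    my ($text) = @_;
    $text =~ s/\d+/adios/g;
    return $text;
}

# Hypothetical stricter variant: replace only standalone integers.
sub replace_integers {
    my ($text) = @_;
    # (?<![.\w])  not preceded by a dot, digit, letter or underscore
    # [-+]?       optional sign
    # (?![.\w])   not followed by a dot, digit, letter or underscore
    $text =~ s/(?<![.\w])[-+]?\d+(?![.\w])/adios/g;
    return $text;
}

print naive_replace("3.1415E01"), "\n";
print replace_integers("n = 42; x = 3.1415E01; k = -7;"), "\n";
```

The first call treats the float as three separate digit runs; the second leaves it untouched and replaces only 42 and -7.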

jue
 
elliot

Even facetious replies helped me figure this out. It took me a while to
work out why the 2 was disappearing in the last line, but I think
I got it. It compiles:

elliot@dan:/media/KINGSTON$ perl input2.pl
elliot@dan:/media/KINGSTON$ cat cfile2.c
#include <stdio.h>
#include <malloc.h>
void testc(double **pa)
{
double b;
double *a;
int m;
a = (double*) malloc(sizeof(double)*5);
a[0]=1.23;
a[1]=2.46;
a[2]=3.69;
a[3]=4.11;
a[4]=7.21;
*pa=a;
for (m=0;m<5;m++)
{
b=a[m];
b=b+1.0;
a[m]=b;
}
}
//gcc -c -Wall -Wextra cfile2.c -o testc.o tja
elliot@dan:/media/KINGSTON$ gcc -c -Wall -Wextra cfile2.c -o testc.o
elliot@dan:/media/KINGSTON$ ls -l testc.o
-rw-r--r-- 1 elliot elliot 1048 2011-07-23 23:55 testc.o
elliot@dan:/media/KINGSTON$ cat input2.pl
#!/usr/bin/perl -w
use strict;
my $filename = "cfile.c";
open FILE, $filename or die $!;
open FILE2, ">cfile2.c" or die $!;
while (<FILE>) {
#s/(?<![.\d])\d+(?![.\d])//g;
s/\d+// unless m%/%;
s/\s+//;
print FILE2;
}
elliot@dan:/media/KINGSTON$

A more elegant and useful script, I think, would be one that removes an
integer only if it is the first word on a line, along with the
whitespace that follows it.
 
Uri Guttman

e> #!/usr/bin/perl -w

use warnings is better than -w. It is lexically scoped, so it won't
trigger warnings in modules you use.

e> my $filename = "cfile.c";
e> open FILE, $filename or die $!;
e> open FILE2, ">cfile2.c" or die $!;
e> while (<FILE>) {
e> #s/(?<![.\d])\d+(?![.\d])//g;
e> s/\d+// unless m%/%;

That is a noisy alternate delimiter. {} is the best one for that in most cases.

e> s/\s+//;
e> print FILE2;
e> }

If you want to edit the file in place, you can use File::Slurp's
edit_file_lines.

use File::Slurp qw( edit_file_lines ) ;

edit_file_lines { s/\d+// unless m{/}; s/\s+//; } $filename ;

done.
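
If File::Slurp isn't installed, core Perl can do in-place editing too. This is a minimal sketch of a different technique, the $^I variable, not what edit_file_lines does internally. The sub name and the .bak suffix are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the same two substitutions to $file in
# place, keeping the original as "$file.bak".
sub strip_numbers_in_place {
    my ($file) = @_;
    # Setting $^I switches the <> loop to in-place mode; local restores
    # $^I and @ARGV when the sub returns.
    local ($^I, @ARGV) = ('.bak', $file);
    while (<>) {
        s/\d+// unless m{/};
        s/\s+//;
        print;    # written to the new version of $file, not to STDOUT
    }
    return;
}

# Example call (commented out so that loading this file edits nothing):
# strip_numbers_in_place('cfile.c');
```

From the shell, perl -i.bak -pe with the same two substitutions is the equivalent.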

uri
 
sln

elliot@dan:/media/KINGSTON$ cat input2.pl
#!/usr/bin/perl -w
use strict;
my $filename = "cfile.c";
open FILE, $filename or die $!;
open FILE2, ">cfile2.c" or die $!;
while (<FILE>) {
#s/(?<![.\d])\d+(?![.\d])//g;
s/\d+// unless m%/%;
s/\s+//;
print FILE2;
}
elliot@dan:/media/KINGSTON$

That script produces this output:

#include<stdio.h>
#include<malloc.h>
voidtestc(double **pa)
{doubleb;
double*a;
intm;
a= (double*) malloc(sizeof(double)*);
a[]=1.23;a[]=2.46;a[]=3.69;a[]=4.11;a[]=7.21;*pa=a;for(m=;m<5;m++)
{b=a[m];b=b+.0;a[m]=b;}}


Doesn't make a lot of sense.
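
The run-together lines are what the two substitutions do to input that has no line numbers. A minimal sketch, with a made-up sub name that wraps the loop body from input2.pl:

```perl
use strict;
use warnings;

# Hypothetical wrapper around the two substitutions in input2.pl.
sub input2_substitutions {
    my ($line) = @_;
    $line =~ s/\d+// unless $line =~ m{/};   # drop the first run of digits
    $line =~ s/\s+//;                        # drop the first run of whitespace
    return $line;
}

# Numbered input: the number and the blank after it go, the newline stays.
print input2_substitutions("11 a[0]=1.23;\n");

# Unnumbered input: on "{" the only whitespace is the newline itself,
# so the line is joined to the one that follows.
print input2_substitutions("{\n"), input2_substitutions("double b;\n");
```

On a line with no other whitespace, s/\s+// removes the trailing newline, and the first integer it finds is one that belongs to the code.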

-sln
 
elliot

perl -pe 's/^\s*\d+\s+//' cfile.c

Nice:

$ perl -pe 's/^\s*\d+\s+//' cfile3.c
#include <stdio.h>
#include <malloc.h>
void testc(double **pa)
{
double b;
double *a;
int m;
a = (double*) malloc(sizeof(double)*5);
a[0]=1.23;
a[1]=2.46;
a[2]=3.69;
a[3]=4.11;
a[4]=7.21;
*pa=a;
for (m=0;m<5;m++)
{
b=a[m];
b=b+1.0;
a[m]=b;
}
}

# perl -pe 's/^\s*\d+\s+//' cfile3.c
// gcc -c -Wall -Wextra cfile2.c -o testc.o
$

Sometimes simple and elegant are the same things.
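
One thing to watch: on a line that holds only a number, such as line 3 or 9, the trailing \s+ matches the newline, so the empty line disappears from the output above. A minimal sketch of a variant that keeps those lines, with made-up sub names and assuming [^\S\n] (whitespace other than newline) is acceptable for this input:

```perl
use strict;
use warnings;

# The substitution from the one-liner, wrapped in a sub for comparison.
sub strip_one_liner {
    my ($line) = @_;
    $line =~ s/^\s*\d+\s+//;
    return $line;
}

# Hypothetical variant: never lets the match run across the newline.
sub strip_keep_newline {
    my ($line) = @_;
    $line =~ s/^[^\S\n]*\d+[^\S\n]*//;
    return $line;
}

for my $line ("2 #include <malloc.h>\n", "3\n", "4 void testc(double **pa)\n") {
    print strip_keep_newline($line);
}
```

Both versions remove all whitespace after the number, so any indentation that followed it is lost as well.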
 
elliot

What input files were you using?

$ cat cfile3.c
1 #include <stdio.h>
2 #include <malloc.h>
3
4 void testc(double **pa)
5 {
6 double b;
7 double *a;
8 int m;
9
10 a = (double*) malloc(sizeof(double)*5);
11 a[0]=1.23;
12 a[1]=2.46;
13 a[2]=3.69;
14 a[3]=4.11;
15 a[4]=7.21;
16 *pa=a;
17 for (m=0;m<5;m++)
18 {
19 b=a[m];
20 b=b+1.0;
21 a[m]=b;
22 }
23 }


// gcc -c -Wall -Wextra cfile2.c -o testc.o

Something like this?
 
sln

It wasn't apparent (to me) that the line numbering was the thing you
wanted stripped. Why not just use a copy of the original?

But if that's the case, it's easy to strip.
Below is what you would do to preserve the original source formatting
(in case a compile error leaves you needing a reference back to it).

-sln

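use strict;
use warnings;

# Slurp mode: read all of DATA as one string.
$/ = undef;

my $src = <DATA>;
my $maxpad = 20;

# Find the narrowest run of blanks that follows any line number.
# [^\S\n] is whitespace other than newline, so a match never
# spills onto the next line.
while ($src =~ / ^ [^\S\n]* \d+ ([^\S\n]+) /xmg) {
    if (length($1) < $maxpad) {
        $maxpad = length $1;
    }
}

print "\nmaximum whitespace pad after line number = $maxpad\n\n", '-' x 20, "\n\n";

# Strip each line number plus at most $maxpad blanks, so any
# indentation beyond the common pad survives.
$src =~ s/ ^ [^\S\n]*\d+[^\S\n]{0,$maxpad} //xmg;

print "'$src'\n";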


__DATA__
1 #include <stdio.h>
2 #include <malloc.h>
3
4 void testc(double **pa)
5 {
6 double b;
7 double *a;
8 int m;
9
10 a = (double*) malloc(sizeof(double)*5);
11 a[0]=1.23;
12 a[1]=2.46;
13 a[2]=3.69;
14 a[3]=4.11;
15 a[4]=7.21;
16 *pa=a;
17 for (m=0;m<5;m++)
18 {
19 b=a[m];
20 b=b+1.0;
21 a[m]=b;
22 }
23 }

Output:

maximum whitespace pad after line number = 1

--------------------

'#include <stdio.h>
#include <malloc.h>

void testc(double **pa)
{
double b;
double *a;
int m;

a = (double*) malloc(sizeof(double)*5);
a[0]=1.23;
a[1]=2.46;
a[2]=3.69;
a[3]=4.11;
a[4]=7.21;
*pa=a;
for (m=0;m<5;m++)
{
b=a[m];
b=b+1.0;
a[m]=b;
}
}'
 
sln

I'm not going to try to read your mind, but there are some things to consider.

The regex becomes much more complicated, depending on how
(or what) generated the line numbered source file.

For example, maybe a generator did this:

9 a+=4;

10 a &= 65535;

as part of its own formatting.
Or there might be other hidden formatting peculiarities or issues,
even some that depend on knowledge of C parsing. Some editors are
famous for that. Or there could be cut-and-paste issues.

Some of the above-mentioned issues could make the compiler's interpretation
of the source significantly different from what you might expect.

So, be forewarned.

-sln
 
