extracting strings from a text file

Andry

Hi,
I have a text file captured from an SSH session.
Each line of the text looks like this (opened with VI editor):
***********************************************************************************
-rw-r--r-- 1 root root 2389787 Sep 30 10:45 ^[[00mfilename.pl^[[00m
**********************************************************************************
As you can see a lot of spurious/control/special characters are shown
(in VI editor).
I need to extract just the filenames at the end of each line (getting
rid of spurious characters).
The result should be like this:
***********************************************************************************
filename.pl
***********************************************************************************
Of course, I don't know in advance the value of the string to extract
(nor its length to pass to a "substring" function).
Can you suggest any method to extract the single file name at the end
of each line?

Thank you,
Andrea
 
Josef Moellers

The control characters are ANSI console control sequences.
AFAIK they consist of an ESC character followed by an optional left
square bracket followed by numbers separated by semicolons followed by a
letter, so you might try to weed out "\033.*?[[:alpha:]]".

Another possibility would be to use the command "/bin/ls" rather than
"ls"; the latter is most likely an alias for "ls --color=auto".

HTH,

Josef
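
For reference, a minimal sketch of the first suggestion. It assumes the captured file contains real ESC bytes (vi displays them as ^[) and that every sequence has the form ESC, "[", digits and semicolons, one letter. The pattern is deliberately narrower than "\033.*?[[:alpha:]]", and the helper name strip_ansi is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: remove ANSI sequences such as ESC [ 0 0 m.
sub strip_ansi {
    my ($text) = @_;
    # ESC, "[", optional digits/semicolons, then one final letter
    $text =~ s/\e\[[0-9;]*[A-Za-z]//g;
    return $text;
}

# Sample line modelled on the one in the question, with real ESC bytes.
my $line = "-rw-r--r-- 1 root root 2389787 Sep 30 10:45 \e[00mfilename.pl\e[00m";
print strip_ansi($line), "\n";
```

Anything that is not such a sequence is left untouched, so the same call is harmless on lines that were captured without colour codes.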
 
Andry

Josef Moellers said:
The control characters are ANSI console control sequences.
AFAIK they consist of an ESC character followed by an optional left
square bracket followed by numbers separated by semicolons followed by a
letter, so you might try to weed out "\033.*?[[:alpha:]]".

Another possibility would be to use the command "/bin/ls" rather than
"ls"; the latter is most likely an alias for "ls --color=auto".

HTH,

Josef
--
These are my personal views and not those of Fujitsu Siemens Computers!
Josef Möllers (Pinguinpfleger bei FSC)
        If failure had no penalty success would not be a prize (T..  Pratchett)
Company Details:http://www.fujitsu-siemens.com/imprint.html

Thanks Josef!
The /bin/ls option works great!

Now, I can't get the filename out of the string.
I tried with:
$extract =~ s/^.*?(\w+)\s*$/$1/;
and I got:
*******************
pl
*******************
Then, I tried with:
$extract =~ s/^.*?(\w+)\.(\w+)\s*$/$1/;
and I got:
*******************
filename
*******************
While what I want is:
*******************
filename.pl
*******************

Could you help with that, please?

Thanks,
Andrea
 
Tim Greer

Andry said:
$extract =~ s/^.*?(\w+)\.(\w+)\s*$/$1/;

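filename is captured in $1 and pl in $2, but your replacement only puts
back $1. The \. is matched but not captured, so it has to be written out
in the replacement or moved inside the parentheses.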

So:

$extract =~ s/^.*?(\w+)\.(\w+)\s*$/$1.$2/;

or:

$extract =~ s/^.*?(\w+\.\w+)\s*$/$1/;

You probably also want to anchor on a word boundary or on whitespace so
you get all of "filename" in filename.pl. \w does not match characters
such as "-", so a name like my-file.pl would be cut to file.pl, depending
on how the file is formatted (or could be).
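
One way to sidestep the \w limitation is to capture the last run of non-whitespace characters with a match instead of a substitution. This is only a sketch: it assumes the colour codes are already gone and that filenames contain no spaces. The sub name last_field is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last whitespace-delimited token of a line,
# or undef if the line holds nothing but whitespace.
sub last_field {
    my ($line) = @_;
    my ($name) = $line =~ /(\S+)\s*$/;
    return $name;
}

my $line = "-rw-r--r-- 1 root root 2389787 Sep 30 10:45 filename.pl\n";
my $name = last_field($line);
print "$name\n" if defined $name;
```

Because \S matches dots, dashes and digits alike, names such as my-file.tar.gz come back whole.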
 
cartercc

-rw-r--r-- 1 root root 2389787 Sep 30 10:45 ^[[00mfilename.pl^[[00m
Then, I tried with:
$extract =~ s/^.*?(\w+)\.(\w+)\s*$/$1/;
and I got:
*******************
filename
*******************
While what I want is:
*******************
filename.pl
*******************

Could you help with that, please?

UNTESTED

while (<DATA>)    # read from the DATA filehandle (the lines after __DATA__)
{
    my @line = split;
    my $filename = $line[9]; #if it IS 9
    $filename =~ s/^.*?(\w+)\.(\w+)\s*$/$1.$2/;    # bind with =~, not =
    print $filename, "\n";
}

CC
 
Tim Greer

cartercc said:
$filename = $line[9]; #if it IS 9

Remember, array indexes start from 0 rather than 1. If you use split,
the filename is in [8].
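
A quick way to check the index is to print every field next to its position. The sketch below uses the sample line from the question without the colour codes; the sub name fields_of is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line on runs of whitespace.
sub fields_of {
    my ($line) = @_;
    return split ' ', $line;
}

my @fields = fields_of('-rw-r--r-- 1 root root 2389787 Sep 30 10:45 filename.pl');

# Print each field with its zero-based index.
printf "%d: %s\n", $_, $fields[$_] for 0 .. $#fields;

# A negative index counts from the end, so [-1] is the last field.
print "last: $fields[-1]\n";
```

Using $fields[-1] avoids counting columns at all, which helps if the number of fields ever differs between lines.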
 
Tim Greer

my $line = '-rw-r--r-- 1 root root 2389787 Sep 30 10:45 ^[[00mfilename.pl^[[00m';

$line = (split /\s+/, $line)[8];
$line =~ s/\^\[\[00m//g;

One way to do it.
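
Note that the substitution above removes the literal characters ^[[00m, which is how vi displays the sequence; in the raw file the first two characters are normally a single ESC byte. A hedged variant that accepts either form and takes the last field instead of a fixed index might look like this. The sub name clean_name is hypothetical, and the pattern only covers colour sequences ending in "m".

```perl
use strict;
use warnings;

# Hypothetical helper: take the last field of an "ls -l" style line and
# strip colour sequences written either with a real ESC byte or as the
# literal text "^[" that vi shows for it.
sub clean_name {
    my ($line) = @_;
    my $name = (split ' ', $line)[-1];
    return unless defined $name;    # empty or whitespace-only line
    $name =~ s/(?:\e|\^\[)\[[0-9;]*m//g;
    return $name;
}

# Literal form, as pasted from vi:
print clean_name('-rw-r--r-- 1 root root 2389787 Sep 30 10:45 ^[[00mfilename.pl^[[00m'), "\n";
# Raw form, with real ESC bytes:
print clean_name("-rw-r--r-- 1 root root 2389787 Sep 30 10:45 \e[00mfilename.pl\e[00m"), "\n";
```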
 
Andry

Tim Greer said:
my $line = '-rw-r--r-- 1 root root 2389787 Sep 30 10:45 ^[[00mfilename.pl^[[00m';

$line = (split /\s+/, $line)[8];
$line =~ s/\^\[\[00m//g;

One way to do it.
--
Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
and Custom Hosting.  24/7 support, 30 day guarantee, secure servers.
Industry's most experienced staff! -- Web Hosting With Muscle!

Thanks guys!
All your suggestions were very helpful to me.

Andrea
 
J. Gleixner

Andry said:
Hi,
I have a text file captured from an SSH session.
Each line of the text looks like this (opened with VI editor):
***********************************************************************************
-rw-r--r-- 1 root root 2389787 Sep 30 10:45 ^[[00mfilename.pl^[[00m
**********************************************************************************
As you can see a lot of spurious/control/special characters are shown
(in VI editor).
I need to extract just the filenames at the end of each line (getting
[...]

If you don't need any of the 'long' output, why not use
the correct option to 'ls' in the first place?
 
