parsing Nero .nri file -- possible character set issues (perl newbie)

m

redhat 7.2, perl 5.6.1

am trying to extract filenames from Nero .nri file, which has quite a
bit of extra, strange characters. the desired filename is readable
when viewed in wordpad/notepad and the extra junk is a mix of
a-zA-Z0-9 interspersed with boxes (the type that notepad uses when it
doesn't have that character). pico doesn't much like it--won't
display anything. less can display it but the strange characters all
show up as ^@ or ^A or <95>, etc..

anytime i try to open the source .nri file with a perl script, it
outputs all sorts of crazy talk and changes my prompt's characters to
crazy characters, instead of being the normal [user@server dir]. i
have to exit my ssh connection and re-login to return the prompt to
readable text.

1) what is happening to my server when i use perl to access this file?
how can i set it back to human readable w/o logging off?

2) is there a way to tell perl to skip these strange characters (i
don't need whatever data they represent), convert the file--though from
what i am not sure--to utf-8 or the like, insert some
babelfish::whatchu_talkin_bout module...?

many thanks,

matt
 
Simon Taylor

Hello Matt,

m said:
redhat 7.2, perl 5.6.1

am trying to extract filenames from Nero .nri file, which has quite a
bit of extra, strange characters.

It would help if you could submit a (small) sample of the data you're
trying to parse, along with the code that you're currently using.

There are a lot of very clever people reading this list, but you have to
help them as much as you can to increase your chances of getting a
useful answer.

m said:
anytime i try to open the source .nri file with a perl script, it
outputs all sorts of crazy talk and changes my prompt's characters to
crazy characters,

Is this really what's happening, or is it more accurate to say that the
screen corruption occurs when you *display* the contents of the .nri
file, (and not when you merely open it, as you've implied)?
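
One way to check this without risking the terminal is to print a sanitized copy of the data. The following is only a minimal sketch, not part of the original script: the helper name printable and the sample bytes are made up for illustration. It replaces every byte outside printable ASCII with a dot before anything reaches the screen.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Replace every byte outside printable ASCII (space through ~) with a dot,
# so control characters and escape sequences never reach the terminal.
sub printable {
    my ($bytes) = @_;
    (my $safe = $bytes) =~ tr/\x20-\x7e/./c;
    return $safe;
}

# Hypothetical sample: a length byte, a name, then NUL, escape and high bytes.
my $sample = "\x08song.mp3\0\x1b[0m\x95";
print printable($sample), "\n";
```

If the prompt stays readable with output filtered this way, the corruption comes from printing the raw bytes rather than from opening the file.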

I hope this helps a little. I look forward to seeing a follow up post
from you.

Regards,

Simon Taylor
 
Richard Gration

m said:
1) what is happening to my server when i use perl to access this file?
how can i set it back to human readable w/o logging off?

"reset" is a command that will exit Klingon mode in most *nix shells
 
m

Simon-

Thanks for the response. I tried pasting a portion of the nri file
into this window but it doesn't seem to like that. the whole file is
here: http://mattpepple.com/nero/cg36.zip (6k).

i can pico the nri file, but barely anything displays (maybe 15
characters). i can less the file and it shows all sorts of weirdness,
which i touched on in the original post. none of this has any effect
on the crazy character/screen corruption--that does not occur until i
try to open the file with my perl script.

### begin code sample
open (INFILE, "$curfile") || die "couldn't open $curfile";
open (OUTFILE,">> $destXml") || die "couldn't open $destXml\n";
while ($input = <INFILE>) {
    unless ($input =~ /.mp3\s|.ogg\s|.wav\s|.mp4\s|.mp2\s|.wma\s/i) {next};
    $input =~ s/![a-zA-Z0-9-_\s]//;
    $filename = substr $input, 60;
    foreach $filetype (@filetypes) {
        # @filetypes is a list similar to that from line 4: .mp3, .ogg, etc..
        $eofname = rindex($input, $filetype);
        if ($eofname > 0) {last};
    }
    $filename = substr($filename, 0, $eofname);
    $format = substr($filename,-3,3);
### end code sample

i know line 5 ($input =~ s/![a-zA-Z0-9-_\s]//;) doesn't do what i
intended. my thought was to put that in a 'while' control structure
and eliminate any occurrences of non [a-zA-Z0-9\s-_] characters.
other than line 5--which was inserted to handle the nri file's weird
characters--the rest of the script does pretty much what i need it to
when tested against a file i made in pico.
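
For reference, the intended "remove everything outside the allowed set" step could be sketched as below. This is an illustration under assumptions, not the poster's code: the sub name keep_allowed and the sample string are hypothetical. Inside a regex, ! is just a literal character; negating a class is done with ^ right after the opening bracket, and the /g flag repeats the substitution so no surrounding while loop is needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Delete every character that is not a letter, digit, underscore,
# hyphen, or whitespace. The hyphen goes last in the class so it is
# taken literally rather than as a range.
sub keep_allowed {
    my ($text) = @_;
    $text =~ s/[^a-zA-Z0-9_\s-]//g;
    return $text;
}

# Hypothetical input with control and high bytes mixed in.
print keep_allowed("\x01\x00My Song-01_a\x95\x02 mp3\n");
```

Note that the class as stated does not include the dot, so the "." in ".mp3" would be deleted as well; it would have to be added to the class if the extension test runs after this step.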

if any more info is needed, just let me know. i very much appreciate
any help you can offer.

matt
 
m

Richard Gration said:
"reset" is a command that will exit Klingon mode in most *nix shells

richard--thanks for the tip, but i'm afraid to report that it does not
correct the problem. anything i type, once in klingon mode, also
appears in klingon (don't know if that helps)

matt
 
Ben Morrow

m said:
richard--thanks for the tip, but i'm afraid to report that it does not
correct the problem. anything i type, once in klingon mode, also
appears in klingon (don't know if that helps)

Yeah, it will do. What you have to do is (very carefully, 'cos you
can't see what you're doing :) type <Ctrl-C> reset <return>. This
(should) put your terminal back to normal. You may also like to try
'stty sane'.

Ben
 
John W. Krahn

m said:
Thanks for the response. I tried pasting a portion of the nri file
into this window but it doesn't seem to like that. the whole file is
here: http://mattpepple.com/nero/cg36.zip (6k).

This will read the file names from your .nri file. HTH

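#!/usr/bin/perl
use strict;
use warnings;

my $file = 'cg36.nri';
open my $fh, '<', $file or die "Cannot open $file: $!";
binmode $fh;    # the .nri file is binary data, not text

$/ = "\0";      # read NUL-terminated records instead of lines
my @files;
while ( <$fh> ) {
    chomp;      # strip the trailing "\0"
    # a record of interest is one length byte followed by a file name
    next unless my ( $len, $name ) = /^(.)(.+\.(?:mp[234]|ogg|wav|wma))$/si;
    die "Name length error!\n" unless length $name == ord $len;
    push @files, $name;
}

print "$_\n" for @files;

__END__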



John
 
Bart Lateur

Jim said:
For example, the file name "Babyface -
Tender Lover - 01 - It's No Crime.mp3 " starts at position 0327
(decimal 215) in the file.

I also noticed the character just in front of it is a chr(48), ("0"),
and that this string is 48 bytes long. So, it definitely looks like a
string length byte.

Maybe the other fields follow at a fixed offset.
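
If that reading is right, a length-prefixed string can be pulled out with unpack's "C/a" template, which reads one unsigned byte as a count and then that many bytes of string. The following is a rough sketch under that assumption only: the sub name read_counted_string and the sample buffer are invented for illustration, and nothing here is taken from a documented .nri layout.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read one length-prefixed string from $buf starting at byte offset $pos.
# Returns the string and the offset of the first byte after it.
sub read_counted_string {
    my ($buf, $pos) = @_;
    my ($name) = unpack 'C/a', substr($buf, $pos);
    return ($name, $pos + 1 + length $name);
}

# Hypothetical buffer: 3 filler bytes, then chr(8) and an 8-byte name.
my $buf = "\x01\x02\x03" . chr(8) . "song.mp3" . "\0\0";
my ($name, $next) = read_counted_string($buf, 3);
print "$name ends before offset $next\n";
```

The returned offset is where one would start looking for any following fields, if they do sit at a fixed distance from the name.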
 
m

simon, jim, bart, john, richard, ben-

may you live to be 1,000 years old. sorry it's been a few days since
i could utilize these suggestions (busy week) but the information was
exactly what i needed. thank you all so very much.

matt
 
