What are these extra spaces in my registry?

joe

Hi all,
I am writing a Perl script to check my Windows 2000 registry to make
sure some keys were added after I installed some software. In my
script I use:
system "regedit \e ....
to dump the registry to a file. I then open this file and do the following
to see if my key is in the file:
while (<fhTestFile>) {
    if (index ($_, $testValue) > -1) {
        close (fhTestFile);
        return ("TRUE");
    }
}

The problem I have is that the registry file looks like there is a
space between each character in it. So I may be looking for:
LOCAL_MACHINE_SOFTWARE
The file contains:
L O C C L _ M A C H I N E _ S O F T W A R E
So my match does not work. Also, I don't think the blank spaces are
just extra spaces, because I added spaces to $testValue and it still
didn't match. Does anyone have any idea what this extra character is, and
how I can get rid of it? I already tried adding chomp $_ before the
if and that didn't help.

Thanks,
Zim





A. Sinan Unur

(e-mail address removed) (joe) wrote in
> Hi all,
> I am writing a Perl script to check my Windows 2000 registry to make
> sure some keys were added after I installed some software. In my
> script I use:
> system "regedit \e ....
> ....
>
> The problem I have is that the registry file looks like there is a
> space between each character in it. So I may be looking for:
> LOCAL_MACHINE_SOFTWARE
> The file contains:
> L O C C L _ M A C H I N E _ S O F T W A R E

Uhm ... Please do not retype this sort of information. It would have been
best if you had given us a hex dump of the data.

My guess, under the circumstances, is that there is a 0 byte between each
ASCII character because the native character set of Win2K is 16 bits wide
(I do not think it is UTF-16, if such a thing exists).

> So my match does not work. Also, I don't think the blank spaces are
> just extra spaces, because I added spaces to $testValue and it still
> didn't match. Does anyone have any idea what this extra character is, and
> how I can get rid of it?

I don't think I know _the_ right way of dealing with that.

Sinan.
 
Ben Morrow

Quoth (e-mail address removed) (joe):
> Hi all,
> I am writing a Perl script to check my Windows 2000 registry to make
> sure some keys were added after I installed some software. In my
> script I use:
> system "regedit \e ....
> To dump the registry to a file,

There are at least two registry modules, which you may find easier to
use.
> I then open this file and do the following
> to see if my key is in the file:
> while (<fhTestFile>) {

Use lexical FHs instead. See 'Indirect Filehandles' in perlopentut.

> if (index ($_, $testValue) > -1) {

I would use a regex for this (I'd bet it's as fast, though I haven't
tested it and it almost certainly doesn't matter anyway; it's certainly
clearer to someone who knows Perl)

if (/\Q$testValue/) {

> close (fhTestFile);

If you use lexical FHs they close themselves.

> return ("TRUE");

This is in general a bad idea: it's better to use 1 for true and '' or
undef for false (which is what Perl itself uses).

> }
> }

> The problem I have is that the registry file looks like there is a
> space between each character in it. So I may be looking for:
> LOCAL_MACHINE_SOFTWARE
> The file contains:
> L O C C L _ M A C H I N E _ S O F T W A R E
> So my match does not work. Also, I don't think the blank spaces are
> just extra spaces, because I added spaces to $testValue and it still
> didn't match. Does anyone have any idea what this extra character is, and
> how I can get rid of it?

The file is in UTF16, M$'s preferred format for text nowadays. Under
perl5.8, you can use

open my $TESTFILE, '<:encoding(utf16)', $filename or die ...;

to have it decoded for you; under earlier Perl versions you can use
Unicode::String or you may be able to get away with knowing that the
'spaces' are in fact null bytes "\000".
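
For what it's worth, here is a minimal sketch of how the suggestions above might fit together under Perl 5.8 or later: a lexical filehandle, the encoding layer, a \Q regex, and 1 or '' as the return value. The helper name file_contains and the temporary stand-in file are made up for illustration; a real regedit export would take the place of the temp file.

```perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempfile);

# Hypothetical helper: returns 1 if $needle occurs in the UTF-16 file,
# '' otherwise. The lexical filehandle is closed when it goes out of scope.
sub file_contains {
    my ($filename, $needle) = @_;
    open my $fh, '<:encoding(UTF-16)', $filename
        or die "Cannot open $filename: $!";
    while (my $line = <$fh>) {
        return 1 if $line =~ /\Q$needle\E/;
    }
    return '';
}

# Build a small stand-in for an exported file: BOM followed by UTF-16LE text.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "\xFF\xFE", encode('UTF-16LE', "HKEY_LOCAL_MACHINE\\SOFTWARE\r\n");
close $tmp;

print file_contains($path, 'LOCAL_MACHINE') ? "found\n" : "not found\n";
```

Because the layer decodes the text before the match runs, $needle can stay an ordinary string with no padding between characters.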

> I already tried adding chomp $_ before the if and that didn't help.

I am slightly puzzled as to why you thought it might... does perldoc -f
chomp suggest *anywhere* that it touches the middle of a string?

Ben
 
Harry

joe wrote...
> The problem I have is that the registry file looks like there is a
> space between each character in it. So I may be looking for:
> LOCAL_MACHINE_SOFTWARE
> The file contains:
> L O C C L _ M A C H I N E _ S O F T W A R E

The "space" characters are \000, as seen by hex dump.

$ cat xx.reg
ÿ_H K E Y _ L O C A L _ M A C H I N E

$ od -bc < xx.reg
0000000 377 376 110 000 113 000 105 000 131 000 137 000 114 000 117 000
        377 376   H  \0   K  \0   E  \0   Y  \0   _  \0   L  \0   O  \0
0000020 103 000 101 000 114 000 137 000 115 000 101 000 103 000 110 000
          C  \0   A  \0   L  \0   _  \0   M  \0   A  \0   C  \0   H  \0
0000040 111 000 116 000 105 000 015 000 012 000 015 000 012 000
          I  \0   N  \0   E  \0  \r  \0  \n  \0  \r  \0  \n  \0

You can get rid of those \0 characters by tr.

$ tr -d '\0' < xx.reg
ÿ_HKEY_LOCAL_MACHINE
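
The same clean-up can be sketched inside Perl itself, working on raw bytes. This is only a rough equivalent of the tr command above, and it assumes the data is little-endian UTF-16 containing nothing but ASCII-range characters; anything else would be mangled. The helper name strip_utf16_nulls is made up, and unlike the shell version it also drops the two BOM bytes (377 376 in the dump).

```perl
use strict;
use warnings;

# Hypothetical helper: crude clean-up of UTF-16LE bytes holding ASCII-only text.
sub strip_utf16_nulls {
    my ($bytes) = @_;
    $bytes =~ s/\A\xFF\xFE//;   # drop the little-endian BOM if present
    $bytes =~ tr/\0//d;         # delete every NUL byte
    return $bytes;
}

# The same bytes as in the od dump, minus the trailing empty line.
my $raw = "\xFF\xFEH\0K\0E\0Y\0_\0L\0O\0C\0A\0L\0_\0M\0A\0C\0H\0I\0N\0E\0\r\0\n\0";
my $clean = strip_utf16_nulls($raw);
print $clean;
```

If the file is read this way, the handle should be in binmode so the bytes arrive untouched.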
 
Brian McCauley

A. Sinan Unur said:
> (e-mail address removed) (joe) wrote in
> My guess, under the circumstances, is that there is a 0 byte between each
> ASCII character because the native character set of Win2K is 16 bits wide
> (I do not think it is UTF-16, if such a thing exists).

Er, no. On recent Windows the native character set is Unicode (which is
wider than 16 bits) and the native encoding is UTF-16. UTF-16 exists in
two flavours, BE or LE, and I can't recall which Win32 uses but it
doesn't matter because Perl will autodetect.

> I don't think I know _the_ right way of dealing with that.

I know _a_ right way (in Perl 5.8.x where x may need to be >0, I'm not
sure):

open my $fh, '<:encoding(utf16)', 'somefile.reg' or die $!;
 
Ben Morrow

Quoth Brian McCauley said:
> Er, no. On recent Windows the native character set is Unicode (which is
> wider than 16 bits) and the native encoding is UTF-16. UTF-16 exists in
> two flavours, BE or LE, and I can't recall which Win32 uses but it
> doesn't matter because Perl will autodetect.

LE with BOM, as Intel processors are LE. This means you should use
utf16, not utf16le, as utf16le will return the BOM as a character.

Perl can only autodetect if the text has a BOM, BTW, so in general (say,
some text in the middle of a file) it can't.
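
A small sketch of the BOM point, using the Encode module that ships with Perl 5.8 and later. The byte strings are made-up test data. As far as I know, decoding BOM-less data as plain UTF-16 makes Encode croak, but I have wrapped that call in eval and left the outcome to the script to report rather than promise it here.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $with_bom    = "\xFF\xFEH\0K\0";   # BOM followed by "HK" in UTF-16LE
my $without_bom = "H\0K\0";           # the same text with no BOM

my $auto = decode('UTF-16',   $with_bom);   # byte order taken from the BOM
my $le   = decode('UTF-16LE', $with_bom);   # BOM comes back as U+FEFF

printf "UTF-16:   %d chars\n", length($auto);
printf "UTF-16LE: %d chars, first is U+%04X\n", length($le), ord($le);

# Without a BOM there is nothing for plain UTF-16 to detect.
my $ok = eval { decode('UTF-16', $without_bom); 1 };
print "UTF-16 without BOM: ", ($ok ? "decoded\n" : "failed: $@");
```

For text known to be little-endian with no BOM, naming UTF-16LE explicitly avoids the detection step altogether.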

Ben
 
A. Sinan Unur

Brian McCauley wrote:

> Er, no. On recent Windows the native character set is Unicode (which
> is wider than 16 bits) and the native encoding is UTF-16. UTF-16
> exists in two flavours, BE or LE, and I can't recall which Win32 uses
> but it doesn't matter because Perl will autodetect.

Ah! Thank you for the correction.

Sinan.
 
