Convert portion of string to Octal

Bill H

If I have a string that may contain characters with an ascii value of
128 or greater, other than looking at each character, is there a way of
converting each occurrence of these characters to an octal
representation in the format of \### where ### is the octal value (the
slash is required in the output)?

For example, if I have a string that is:

"The next character is octal ü"

I want to convert it to the following format:

"The next character is octal \374"

If there is something in the perldocs on how to do this please let me
know where to look

Thank you,

Bill H
 
John W. Krahn

Bill said:
If I have a string that may contain characters with an ascii value of
128 or greater, other than looking at each character, is there a way of
converting each occurrence of these characters to an octal
representation in the format of \### where ### is the octal value (the
slash is required in the output)?

For example, if I have a string that is:

"The next character is octal ü"

I want to convert it to the following format:

"The next character is octal \374"

If there is something in the perldocs on how to do this please let me
know where to look


s/ ( [^[:ascii:]] ) / sprintf '\%o', ord $1 /gex;



John
 
Paul Lalli

Bill said:
If I have a string that may contain characters with an ascii value of
128 or greater,

I believe that's a contradiction in terms. ASCII goes from 0 to 127
(0x00 to 0x7F).
other than looking at each character, is there a way of
converting each occurrence of these characters to an octal
representation in the format of \### where ### is the octal value (the
slash is required in the output)?

For example, if I have a string that is:

"The next character is octal ü"

I want to convert it to the following format:

"The next character is octal \374"

If there is something in the perldocs on how to do this please let me
know where to look

perldoc perlre
perldoc -f sprintf
perldoc -f ord

#!/usr/bin/perl -l
use strict;
use warnings;

$_ = 'This is my ü string';
s { ([\x80-\xFF]) }
{ sprintf '\\%o', ord($1) }gex;
print;
__END__

This is my \374 string


Paul Lalli
 
Bill H

John said:
Bill said:
If I have a string that may contain characters with an ascii value of
128 or greater, other than looking at each character, is there a way of
converting each occurrence of these characters to an octal
representation in the format of \### where ### is the octal value (the
slash is required in the output)?

For example, if I have a string that is:

"The next character is octal ü"

I want to convert it to the following format:

"The next character is octal \374"

If there is something in the perldocs on how to do this please let me
know where to look


s/ ( [^[:ascii:]] ) / sprintf '\%o', ord $1 /gex;

Thanks for the quick response but that converts all the characters - is
there a way for it to only do those with ascii values over 127?

I did a quick test of it using:

$a = "The next character is octal ü" ;
$a =~ s/ ( [^[:ascii:]] ) / sprintf '\%o', ord $1 /gex;
print $a;

Bill H
 
brian d foy

Bill said:
For example, if I have a string that is:

"The next character is octal ü"

I want to convert it to the following format:

"The next character is octal \374"

Both sprintf and printf support the %o format for octal numbers,
and you can use the ord() builtin to get the numeric value of
the character.

printf "The next character is octal \\%o", ord( $char );
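
As a rough sketch of that per-character approach: the helper name octal_escape is made up for illustration, and the input is assumed to be a single-byte (Latin-1) character rather than a UTF-8 encoded one.

```perl
use strict;
use warnings;

# Hypothetical helper: format one character as a backslash
# followed by the octal value of its code point.
sub octal_escape {
    my ($char) = @_;
    return sprintf '\\%o', ord $char;
}

my $char = "\xFC";                 # u-umlaut in Latin-1, code point 252
print octal_escape($char), "\n";   # prints \374
```

This only formats a single character; it still needs something like a substitution to pick out which characters to convert.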
 
Paul Lalli

Bill said:
John said:
Bill H wrote:
s/ ( [^[:ascii:]] ) / sprintf '\%o', ord $1 /gex;

Thanks for the quick response but that converts all the characters - is
there a way for it to only do those with ascii values over 127?

I did a quick test of it using:

$a = "The next character is octal ü" ;
$a =~ s/ ( [^[:ascii:]] ) / sprintf '\%o', ord $1 /gex;
print $a;

You did something wrong. When I run those three lines of code - by
copying and pasting your post - I get:
The next character is octal \374

What output did you obtain?

Paul Lalli
 
brian d foy

Bill H said:
John W. Krahn wrote:
s/ ( [^[:ascii:]] ) / sprintf '\%o', ord $1 /gex;
Thanks for the quick response but that converts all the characters - is
there a way for it to only do those with ascii values over 127?
I did a quick test of it using:

$a = "The next character is octal ü" ;
$a =~ s/ ( [^[:ascii:]] ) / sprintf '\%o', ord $1 /gex;
print $a;

That script works for me. The character class is negated and excludes
all of the ASCII characters, so the substitution only works on the
non-ASCII ones.
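
A small check of that behaviour, as a sketch only: the sub name escape_non_ascii is hypothetical, and the sample string is assumed to contain Latin-1 bytes written as \x escapes so the source file encoding does not matter.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every non-ASCII character with
# a backslash plus its octal code, leaving ASCII untouched.
sub escape_non_ascii {
    my ($s) = @_;
    $s =~ s/([^[:ascii:]])/sprintf '\\%o', ord $1/ge;
    return $s;
}

# \xE9 is e-acute (233), \xEF is i-diaeresis (239)
print escape_non_ascii("caf\xE9 na\xEFve"), "\n";   # caf\351 na\357ve
print escape_non_ascii("plain ASCII"), "\n";        # plain ASCII
```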
 
Bill H

Paul said:
Bill said:
John said:
Bill H wrote:
s/ ( [^[:ascii:]] ) / sprintf '\%o', ord $1 /gex;

Thanks for the quick response but that converts all the characters - is
there a way for it to only do those with ascii values over 127?

I did a quick test of it using:

$a = "The next character is octal ü" ;
$a =~ s/ ( [^[:ascii:]] ) / sprintf '\%o', ord $1 /gex;
print $a;

You did something wrong. When I run those three lines of code - by
copying and pasting your post - I get:
The next character is octal \374

What output did you obtain?

Every character was converted to the \octal value. I replaced the
^[:ascii:] with the part of the code you gave [\x80-\xFF] and it worked
fine after that.
 
ced

Bill said:
Paul said:
Bill said:
...
s/ ( [^[:ascii:]] ) / sprintf '\%o', ord $1 /gex;


Thanks for the quick response but that converts all the characters - is
there a way for it to only do those with ascii values over 127?

I did a quick test of it using:

$a = "The next character is octal ü" ;
$a =~ s/ ( [^[:ascii:]] ) / sprintf '\%o', ord $1 /gex;
print $a;

You did something wrong. When I run those three lines of code - by
copying and pasting your post - I get:
The next character is octal \374

What output did you obtain?

Every character was converted to the \octal value. I replaced the
^[:ascii:] with the part of the code you gave [\x80-\xFF] and it worked
fine after that.

Your expression looks ok but you'd see a mass conversion if you inserted
a space between the [ and ^ like this:

s/ ( [ ^[:ascii:] ] ) / sprintf '\%o', ord $1 /gex;
      |
      |
    space

The regex 'x' modifier ignores (most) whitespace but not within a
character class.
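
To see the difference side by side, here is a minimal sketch. The two sub names are invented for this comparison, and the test string is assumed to be Latin-1 bytes.

```perl
use strict;
use warnings;

# Caret directly after the bracket: a negated class,
# so only non-ASCII characters match.
sub escape_non_ascii {
    my ($s) = @_;
    $s =~ s/ ( [^[:ascii:]] ) / sprintf '\\%o', ord $1 /gex;
    return $s;
}

# Space before the caret: the class now lists a space, a literal ^
# and every ASCII character, so the ASCII characters get converted.
sub escape_with_stray_space {
    my ($s) = @_;
    $s =~ s/ ( [ ^[:ascii:] ] ) / sprintf '\\%o', ord $1 /gex;
    return $s;
}

my $text = "a b\xFC";
print escape_non_ascii($text), "\n";          # a b\374
print escape_with_stray_space($text), "\n";   # \141\40\142 then the raw \xFC byte
```

Note that with the stray space the non-ASCII character is the one thing left alone, since the class is no longer negated.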

hth,
 
Bill H

Bill said:
Paul said:
Bill H wrote:
...
s/ ( [^[:ascii:]] ) / sprintf '\%o', ord $1 /gex;


Thanks for the quick response but that converts all the characters - is
there a way for it to only do those with ascii values over 127?

I did a quick test of it using:

$a = "The next character is octal ü" ;
$a =~ s/ ( [^[:ascii:]] ) / sprintf '\%o', ord $1 /gex;
print $a;

You did something wrong. When I run those three lines of code - by
copying and pasting your post - I get:
The next character is octal \374

What output did you obtain?

Every character was converted to the \octal value. I replaced the
^[:ascii:] with the part of the code you gave [\x80-\xFF] and it worked
fine after that.

Your expression looks ok but you'd see a mass conversion if you inserted
a space between the [ and ^ like this:

s/ ( [ ^[:ascii:] ] ) / sprintf '\%o', ord $1 /gex;
      |
      |
    space

The regex 'x' modifier ignores (most) whitespace but not within a
character class.

There was a space in it - so that would explain it.
 
