Trouble using string.tr()

Todd Burch · May 10, 2007

I've coded up EBCDIC to ASCII translate strings. Everything is working
fine, except for the greater than symbol. (In EBCDIC, X'6E').

What is happening is that whenever a > is in the input string, it gets
translated to the character immediately following my '>' in the
TO_STRING.

I've spent a lot of time making sure my characters in the FROM_STRING
and TO_STRING are in the proper order.

I've tried escaping it to no avail. I looked in the doc index for
special notes about it, and did not see any. A Google search found
nothing either.

Any ideas? I can provide the code if needed.

Todd

Alex LeDonne · May 10, 2007

Please do post the code.

-A

Robert Klemme · May 10, 2007

Todd said:
Any ideas? I can provide the code if needed.

Yes, that would be good.

robert

Todd Burch · May 10, 2007

Here are the two translate strings. Need more code? Thanks y'all.

$ebcdic_chars = 0x40.chr ; # blank
###$ebcdic_chars += 0xFF.chr ; # unprintable
###$ebcdic_chars += 0x00.chr ; # unprintable

$ebcdic_chars += 0XC1.chr + 0xC2.chr + 0xC3.chr + 0xC4.chr + 0xC5.chr +
0xC6.chr + 0xC7.chr + 0xC8.chr + 0xC9.chr ; # A-I
$ebcdic_chars += 0XD1.chr + 0xD2.chr + 0xD3.chr + 0xD4.chr + 0xD5.chr +
0xD6.chr + 0xD7.chr + 0xD8.chr + 0xD9.chr ; # J-R
$ebcdic_chars += 0xE2.chr + 0xE3.chr + 0xE4.chr + 0xE5.chr +
0xE6.chr + 0xE7.chr + 0xE8.chr + 0xE9.chr ; # S-Z
$ebcdic_chars += 0X81.chr + 0x82.chr + 0x83.chr + 0x84.chr + 0x85.chr +
0x86.chr + 0x87.chr + 0x88.chr + 0x89.chr ; # a-i
$ebcdic_chars += 0X91.chr + 0x92.chr + 0x93.chr + 0x94.chr + 0x95.chr +
0x96.chr + 0x97.chr + 0x98.chr + 0x99.chr ; # j-r
$ebcdic_chars += 0xA2.chr + 0xA3.chr + 0xA4.chr + 0xA5.chr +
0xA6.chr + 0xA7.chr + 0xA8.chr + 0xA9.chr ; # s-z

$ebcdic_chars += 0x4B.chr ; # .
$ebcdic_chars += 0x4C.chr ; # <
$ebcdic_chars += 0x4D.chr ; # (
$ebcdic_chars += 0x4E.chr ; # +
$ebcdic_chars += 0x4F.chr ; # |
$ebcdic_chars += 0x50.chr ; # &

$ebcdic_chars += 0x5A.chr ; # !
$ebcdic_chars += 0x5B.chr ; # $
$ebcdic_chars += 0x5C.chr ; # *
$ebcdic_chars += 0x5D.chr ; # )
$ebcdic_chars += 0x5E.chr ; # ;
$ebcdic_chars += 0x5F.chr ; # ^
$ebcdic_chars += 0x60.chr ; # -
$ebcdic_chars += 0x61.chr ; # /

$ebcdic_chars += 0x6B.chr ; # ,
$ebcdic_chars += 0x6C.chr ; # %
$ebcdic_chars += 0x6D.chr ; # _
$ebcdic_chars += 0x6E.chr ; # >
$ebcdic_chars += 0x6F.chr ; # ?
$ebcdic_chars += 0X79.chr ; # `
$ebcdic_chars += 0x7A.chr ; # :
$ebcdic_chars += 0x7B.chr ; # #
$ebcdic_chars += 0x7C.chr ; # @
$ebcdic_chars += 0x7D.chr ; # '
$ebcdic_chars += 0x7E.chr ; # =
$ebcdic_chars += 0x7F.chr ; # "
$ebcdic_chars += 0xBA.chr ; # [
$ebcdic_chars += 0xBB.chr ; # ]
$ebcdic_chars += 0xC0.chr ; # {
$ebcdic_chars += 0xD0.chr ; # }
$ebcdic_chars += 0xE0.chr ; # \

$ascii_chars = " " ; # blank
###$ascii_chars += " " ; # make ebcdic 0XFF into a blank
###$ascii_chars += ' ' ; # make ebcdic 0X00 into a blank
$ascii_chars += ('A'..'Z').to_a.to_s ; # A-Z
$ascii_chars += ('a'..'z').to_a.to_s ; # a-z

$ascii_chars += '.' ;
$ascii_chars += '<' ;
$ascii_chars += '(' ;
$ascii_chars += '+' ;
$ascii_chars += '|' ;
$ascii_chars += '&' ;

$ascii_chars += '!' ;
$ascii_chars += '$' ;
$ascii_chars += '*' ;
$ascii_chars += ')' ;
$ascii_chars += ';' ;
$ascii_chars += '^' ;
$ascii_chars += '-' ;
$ascii_chars += '/' ;

$ascii_chars += ',' ;
$ascii_chars += '%' ;
$ascii_chars += '_' ;
$ascii_chars += '\>' ;
$ascii_chars += '?' ;
$ascii_chars += '`' ;
$ascii_chars += ':' ;
$ascii_chars += '#' ;
$ascii_chars += '@' ;
$ascii_chars += '\'';
$ascii_chars += '=' ;
$ascii_chars += '"' ;
$ascii_chars += '[' ;
$ascii_chars += ']' ;
$ascii_chars += '{' ;
$ascii_chars += '}' ;
$ascii_chars += '\\' ; # escape char is special - must be doubled

$ebcdic_nums = 0xF0.chr + 0XF1.chr + 0xF2.chr + 0xF3.chr + 0xF4.chr +
0xF5.chr + 0xF6.chr + 0xF7.chr + 0xF8.chr + 0xF9.chr ;
$ascii_nums = "0123456789" ;

$ebcdic_chars += $ebcdic_nums ;
$ascii_chars += $ascii_nums ;

Todd Burch · May 10, 2007

Note, the version I'm showing has the backslash. It fails with or
without it. (and I'm not even certain that \> is even valid!!)

Todd

Robert Klemme · May 10, 2007

Note, the version I'm showing has the backslash. It fails with or
without it. (and I'm not even certain that \> is even valid!!)

=> ">"

The backslash is wrong there.

A more robust way to code this would be to use a Hash - at least to
initially associate ASCII with EBCDIC chars. So, I'd rather do

CHAR_MAP = {
0XC1 => ?A,
0xC2 => ?B,
# ...
}

Then you can do:

ebcdic, ascii = [CHAR_MAP.keys, CHAR_MAP.values].map do |set|
set.inject("") {|st, ch| st << ch}
end

Kind regards

robert

Rob Biedenharn · May 10, 2007

Where does the code X'5C' appear? (That's a \ in ASCII.) Even in
single quotes, there are two special sequences:
irb(main):001:0> a='\\'
=> "\\"
irb(main):002:0> puts a
\
=> nil
irb(main):003:0> b='\''
=> "'"
irb(main):004:0> puts b
'
=> nil

(but please do show the code!)

-Rob

Rob Biedenharn http://agileconsultingllc.com

Todd Burch · May 10, 2007

I'll need to code up a smaller example than the 400+ lines I have right
now. I'll be able to do that soon. Thanks for the replies so far - I
need to study them.

Todd

Todd Burch · May 10, 2007

Here's a whole program that illustrates the error. I'm running under
Windows Ruby 1.8.5. (it's ugly and chopped up, but it works... er...
doesn't work... er... you know what I mean)

Todd

#
# Ruby program to convert a z/OS EBCDIC to ASCII.
#

class Record

#
# The Record.report_values() method uses the Record Layout info and
# parses a record at a time, building the output ASCII row.
#

def Record.report_values(edata) ; # ebcdic data

adata = edata.tr($ebcdic_chars, $ascii_chars) ; # ascii data
puts "ebcdic data = #{edata}" ;
puts ;
puts "ascii data = #{adata}" ;
puts ;
end ; # def report_values

#
# This Record.to_h() method converts a hex byte into a displayable,
# human readable value. i.e. X'F5' to "F5".
#

def Record.to_h(str)
str.unpack('H64').to_s.upcase ;
end ;

#
# This method called to dump the data in hex format.
#

def Record.dump_area(data) ;

fsize = data.size ;
lines,last = fsize.divmod(32) # get # of full lines and figure the remainder.
i = 0
while lines > 0
str = data[i..i+31]
hexdata = Record.to_h(str) # .unpack('H64').to_s.upcase
printf( "%08X %8s %8s %8s %8s %8s %8s %8s %8s %s\n",
i, hexdata[0..7], hexdata[8..15],
hexdata[16..23], hexdata[24..31],
hexdata[32..39], hexdata[40..47],
hexdata[48..55], hexdata[56..63],
str.tr("\000-\037\177-\377",'.'))
i += 32; lines -= 1
end

# Write out the partial line.

str = data[i..fsize]
line = sprintf("%08x ",i)
i.upto(fsize) {|x|
line << sprintf("%2s",data[x..x].unpack('H2').to_s.upcase)
if (((x+1)% 4)==0) then line << " " end
if (((x+1)%16)==0) then line << " " end
}
line << " "*(2+90-line.length)
line << sprintf("%s\n", str.tr("\000-\037\177-\377",'.'))
printf(line)
end ;

end ; # class Record. This is the end of all the method definitions for class Record.

def doit(data) ;

#
# Now starts the translate table definitions. They are order-sensitive.
# Do not change the order unless you have a real reason to.
#
# $ebcdic_chars (global variable) is the "from string" portion of the
# translate process.
# $ascii_chars is the "to string" portion.
#
# $ebcdic_nums is the "from string" for converting from X'F0' through
# X'F9' (0-9 ebcdic)
# $ascii_nums is the "to string" for converting to X'30' through
# X'39' (0-9 ascii)
#

$ebcdic_chars = 0x40.chr ; # blank
###$ebcdic_chars += 0xFF.chr ; # unprintable
###$ebcdic_chars += 0x00.chr ; # unprintable

$ebcdic_chars += 0XC1.chr + 0xC2.chr + 0xC3.chr + 0xC4.chr + 0xC5.chr +
0xC6.chr + 0xC7.chr + 0xC8.chr + 0xC9.chr ; # A-I
$ebcdic_chars += 0XD1.chr + 0xD2.chr + 0xD3.chr + 0xD4.chr + 0xD5.chr +
0xD6.chr + 0xD7.chr + 0xD8.chr + 0xD9.chr ; # J-R
$ebcdic_chars += 0xE2.chr + 0xE3.chr + 0xE4.chr + 0xE5.chr +
0xE6.chr + 0xE7.chr + 0xE8.chr + 0xE9.chr ; # S-Z
$ebcdic_chars += 0X81.chr + 0x82.chr + 0x83.chr + 0x84.chr + 0x85.chr +
0x86.chr + 0x87.chr + 0x88.chr + 0x89.chr ; # a-i
$ebcdic_chars += 0X91.chr + 0x92.chr + 0x93.chr + 0x94.chr + 0x95.chr +
0x96.chr + 0x97.chr + 0x98.chr + 0x99.chr ; # j-r
$ebcdic_chars += 0xA2.chr + 0xA3.chr + 0xA4.chr + 0xA5.chr +
0xA6.chr + 0xA7.chr + 0xA8.chr + 0xA9.chr ; # s-z

$ebcdic_chars += 0x4B.chr ; # .
$ebcdic_chars += 0x4C.chr ; # <
$ebcdic_chars += 0x4D.chr ; # (
$ebcdic_chars += 0x4E.chr ; # +
$ebcdic_chars += 0x4F.chr ; # |
$ebcdic_chars += 0x50.chr ; # &

$ebcdic_chars += 0x5A.chr ; # !
$ebcdic_chars += 0x5B.chr ; # $
$ebcdic_chars += 0x5C.chr ; # *
$ebcdic_chars += 0x5D.chr ; # )
$ebcdic_chars += 0x5E.chr ; # ;
$ebcdic_chars += 0x5F.chr ; # ^
$ebcdic_chars += 0x60.chr ; # -
$ebcdic_chars += 0x61.chr ; # /

$ebcdic_chars += 0x6B.chr ; # ,
$ebcdic_chars += 0x6C.chr ; # %
$ebcdic_chars += 0x6D.chr ; # _
$ebcdic_chars += 0x6E.chr ; # >
$ebcdic_chars += 0x6F.chr ; # ?
$ebcdic_chars += 0X79.chr ; # `
$ebcdic_chars += 0x7A.chr ; # :
$ebcdic_chars += 0x7B.chr ; # #
$ebcdic_chars += 0x7C.chr ; # @
$ebcdic_chars += 0x7D.chr ; # '
$ebcdic_chars += 0x7E.chr ; # =
$ebcdic_chars += 0x7F.chr ; # "
$ebcdic_chars += 0xBA.chr ; # [
$ebcdic_chars += 0xBB.chr ; # ]
$ebcdic_chars += 0xC0.chr ; # {
$ebcdic_chars += 0xD0.chr ; # }
$ebcdic_chars += 0xE0.chr ; # \

$ascii_chars = " " ; # blank
###$ascii_chars += " " ; # make ebcdic 0XFF into a blank
###$ascii_chars += ' ' ; # make ebcdic 0X00 into a blank
$ascii_chars += ('A'..'Z').to_a.to_s ; # A-Z
$ascii_chars += ('a'..'z').to_a.to_s ; # a-z

$ascii_chars += '.' ;
$ascii_chars += '<' ;
$ascii_chars += '(' ;
$ascii_chars += '+' ;
$ascii_chars += '|' ;
$ascii_chars += '&' ;

$ascii_chars += '!' ;
$ascii_chars += '$' ;
$ascii_chars += '*' ;
$ascii_chars += ')' ;
$ascii_chars += ';' ;
$ascii_chars += '^' ;
$ascii_chars += '-' ;
$ascii_chars += '/' ;

$ascii_chars += ',' ;
$ascii_chars += '%' ;
$ascii_chars += '_' ;
$ascii_chars += '>' ;
$ascii_chars += '?' ;
$ascii_chars += '`' ;
$ascii_chars += ':' ;
$ascii_chars += '#' ;
$ascii_chars += '@' ;
$ascii_chars += '\'';
$ascii_chars += '=' ;
$ascii_chars += '"' ;
$ascii_chars += '[' ;
$ascii_chars += ']' ;
$ascii_chars += '{' ;
$ascii_chars += '}' ;
$ascii_chars += '\\' ; # escape char is special - must be doubled

$ebcdic_nums = 0xF0.chr + 0XF1.chr + 0xF2.chr + 0xF3.chr + 0xF4.chr +
0xF5.chr + 0xF6.chr + 0xF7.chr + 0xF8.chr + 0xF9.chr ;
$ascii_nums = "0123456789" ;

$ebcdic_chars += $ebcdic_nums ;
$ascii_chars += $ascii_nums ;

# true if the record was
# as expected, or false if not.
Record.report_values(data)

Record.dump_area( data ) ; # write offending record, in hex dump format, to the INVALID_FORMAT file.

end ; # def doit()

instring = 0XC1.chr + 0xC2.chr + 0x6E.chr + 0xC1.chr + 0xC1.chr +
0x4C.chr + 0xC1.chr + 0xC3.chr ; # EBCDIC for "AB>AA<AC"

doit(instring) ; # run the program.

Todd Burch · May 10, 2007

Here's another twist.

If I set

instring = 0xF1.chr + 0xF2.chr + 0x4B.chr + 0xF3.chr + 0xF4.chr

which is ebcdic for "12.34", I get "45.67" in ascii. It looks like
something ( tr() maybe? ) is swallowing characters.

Todd

Todd Burch · May 10, 2007

Rob said:
where does the code X'5C' appear? (that's a \ in ASCII)

In ebcdic, a 0X5C is an asterisk. In ebcdic, a 0XE0 is a backslash.
Both are in from_string.

Todd

Todd Burch · May 10, 2007

unknown said:
So I don't think it's ALWAYS the case. What's the character before '>'
in your FROM_STRING?

-s

The character prior to the > is an underscore. Todd

Todd Burch · May 10, 2007

Robert said:
The backslash is wrong there.

Correct. I thought it was. I was grasping at straws.

Robert said:
A more robust way to code this would be to use a Hash - at least to
initially associate ASCII with EBCDIC chars. So, I'd rather do

CHAR_MAP = {
0XC1 => ?A,
0xC2 => ?B,
# ...
}

Then you can do:

ebcdic, ascii = [CHAR_MAP.keys, CHAR_MAP.values].map do |set|
set.inject("") {|st, ch| st << ch}
end

Good suggestion. I haven't used hashes at all. However (and maybe I
don't understand the example), I added this to the very bottom of my
script:

char_map = {
0XC1 => ?A ,
0XC2 => ?B,
0XC3 => ?C,
0XC4 => ?D,
0X6E => ?>,
0XF1 => ?1,
0XF2 => ?2,
0XF3 => ?3,
0XF4 => ?4
}

ebcdic, ascii = [char_map.keys, char_map.values].map do |set|
set.inject(instring) {|st, ch| st << ch}
end

puts "ebcdic = #{ebcdic}" ;
puts "ascii = #{ascii}" ;
puts "instring = #{instring}" ;

and it produced some REALLY weird data. What did I do wrong?
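
A note on the weird output: inject(instring) seeds the accumulator with instring itself, and << appends in place, so both passes keep growing that one string, and ebcdic, ascii and instring all end up referring to the same object. Robert's version seeds with a fresh "". If the goal is only the translation, a per-byte hash lookup sidesteps tr and its special characters altogether. A minimal sketch, assuming Ruby 1.9 or later; the helper name ebcdic_to_ascii and the '?' fallback are made up for illustration, and the table holds only the few pairs from the post:

```ruby
# Partial EBCDIC-to-ASCII table using the pairs from the post; extend as needed.
CHAR_MAP = {
  0xC1 => 'A', 0xC2 => 'B', 0xC3 => 'C', 0xC4 => 'D',
  0x6E => '>',
  0xF1 => '1', 0xF2 => '2', 0xF3 => '3', 0xF4 => '4'
}.freeze

# Hypothetical helper: translate one byte at a time through the hash.
# Unmapped bytes become '?' (an arbitrary choice for this sketch).
def ebcdic_to_ascii(data, map = CHAR_MAP)
  data.each_byte.map { |byte| map.fetch(byte, '?') }.join
end

instring = [0xC1, 0xC2, 0x6E, 0xC1, 0xC1].pack('C*') # EBCDIC for "AB>AA"
puts ebcdic_to_ascii(instring) # prints AB>AA
```

Because nothing here passes through tr, characters such as - and \ need no escaping.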

Todd Burch · May 10, 2007

Todd said:
Here's another twist.

If I set

instring = 0xF1.chr + 0xF2.chr + 0x4B.chr + 0xF3.chr + 0xF4.chr

which is ebcdic for "12.34", I get "45.67" in ascii. It looks like
something ( tr() maybe? ) is swallowing characters.

Moving the numbers to the front of the FROM_STRING and the TO_STRING has
solved the problem with the 12.34 getting turned into 45.67 - but I
suspect I have just moved the error to someplace else.

Todd

Jeremy Hinegardner · May 10, 2007

Todd said:
Here's another twist.

If I set

instring = 0xF1.chr + 0xF2.chr + 0x4B.chr + 0xF3.chr + 0xF4.chr

which is ebcdic for "12.34", I get "45.67" in ascii. It looks like
something ( tr() maybe? ) is swallowing characters.

Have you tried just funnelling the data through some other tool? For
instance on *nix there's dd(1) which will do the conversion for you. Or
GNU's recode[1] which will take care of it, and they both take stdin and
stdout parameters.

I know you are on a windows machine, but if you have cygwin or the gnu
utilities for Win32 you should be able to take care of it via a system()
or %x{} call in ruby.

Here is a sample OSX version using Open3.

cat ebcdic2ascii.rb

#!/usr/bin/env ruby

require 'open3'
DD="/bin/dd conv=ascii"
RECODE="/opt/local/bin/recode EBCDIC..ASCII"
INPUT= 0xF1.chr + 0xF2.chr + 0x4B.chr + 0xF3.chr + 0xF4.chr

dd_output = nil
Open3.popen3(DD) do |stdin,stdout,stderr|
stdin.write(INPUT)
stdin.close
dd_output = stdout.read
end

recode_output = nil
Open3.popen3(RECODE) do |stdin,stdout,stderr|
stdin.write(INPUT)
stdin.close
recode_output = stdout.read
end

puts "#{"%10s %10s %10s" % %w(EBCDIC dd recode)}"
INPUT.size.times do |i|
# index by position; unpack and [i, 1] behave the same on any Ruby version
e = "0x%02x" % INPUT.unpack('C*')[i]
puts "#{"%10s %10s %10s" % [e, dd_output[i, 1], recode_output[i, 1]]}"
end

ruby ebcdic2ascii.rb


EBCDIC dd recode
0xf1 1 1
0xf2 2 2
0x4b . .
0xf3 3 3
0xf4 4 4

Some old ruby-talk[2] postings talk about using Iconv, but the code
sample in that message doesn't seem to work anymore.

enjoy,

-jeremy

1 - http://www.gnu.org/software/recode/recode.html
2 - http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/66280

Todd Burch · May 10, 2007

Jeremy said:
Have you tried just funnelling the data through some other tool?

No, and I would rather not. Your example seems pretty thorough.

I think I've narrowed it down to String#tr being broken. I rebuilt my
translate strings with all hex codes for non-alphanumeric characters,
like this:

$ebcdic_nums = 0xF0.chr + 0XF1.chr + 0xF2.chr + 0xF3.chr + 0xF4.chr +
0xF5.chr + 0xF6.chr + 0xF7.chr + 0xF8.chr + 0xF9.chr ;
$ascii_nums = "0123456789" ;

$ebcdic_chars = $ebcdic_nums ;
$ascii_chars = $ascii_nums ;

$ebcdic_chars += 0x40.chr ; # blank
$ascii_chars += 0x20.chr ;

###$ebcdic_chars += 0xFF.chr ; # unprintable
###$ebcdic_chars += 0x00.chr ; # unprintable
###$ascii_chars += " " ; # make ebcdic 0XFF into a blank
###$ascii_chars += ' ' ; # make ebcdic 0X00 into a blank

$ebcdic_chars += 0XC1.chr + 0xC2.chr + 0xC3.chr + 0xC4.chr + 0xC5.chr +
0xC6.chr + 0xC7.chr + 0xC8.chr + 0xC9.chr ; # A-I
$ebcdic_chars += 0XD1.chr + 0xD2.chr + 0xD3.chr + 0xD4.chr + 0xD5.chr +
0xD6.chr + 0xD7.chr + 0xD8.chr + 0xD9.chr ; # J-R
$ebcdic_chars += 0xE2.chr + 0xE3.chr + 0xE4.chr + 0xE5.chr +
0xE6.chr + 0xE7.chr + 0xE8.chr + 0xE9.chr ; # S-Z
$ebcdic_chars += 0X81.chr + 0x82.chr + 0x83.chr + 0x84.chr + 0x85.chr +
0x86.chr + 0x87.chr + 0x88.chr + 0x89.chr ; # a-i
$ebcdic_chars += 0X91.chr + 0x92.chr + 0x93.chr + 0x94.chr + 0x95.chr +
0x96.chr + 0x97.chr + 0x98.chr + 0x99.chr ; # j-r
$ebcdic_chars += 0xA2.chr + 0xA3.chr + 0xA4.chr + 0xA5.chr +
0xA6.chr + 0xA7.chr + 0xA8.chr + 0xA9.chr ; # s-z

$ascii_chars += ('A'..'Z').to_a.to_s ; # A-Z
$ascii_chars += ('a'..'z').to_a.to_s ; # a-z

$ebcdic_chars += 0x4B.chr ; # .
$ascii_chars += 0x2E.chr ;

$ebcdic_chars += 0x6E.chr ; # >
$ascii_chars += 0x3E.chr ;

$ebcdic_chars += 0x4C.chr ; # <
$ascii_chars += 0x3C.chr ;

$ebcdic_chars += 0x4D.chr ; # (
$ascii_chars += 0x28.chr ;

$ebcdic_chars += 0x5D.chr ; # )
$ascii_chars += 0x29.chr ;

$ebcdic_chars += 0x4E.chr ; # +
$ascii_chars += 0x2B.chr ;

$ebcdic_chars += 0x4F.chr ; # |
$ascii_chars += 0x7C.chr ;

$ebcdic_chars += 0x5A.chr ; # !
$ascii_chars += 0x21.chr ;

$ebcdic_chars += 0x5B.chr ; # $
$ascii_chars += 0x24.chr ;

$ebcdic_chars += 0x5C.chr ; # *
$ascii_chars += 0x2A.chr ;

$ebcdic_chars += 0x5E.chr ; # ;
$ascii_chars += 0x3B.chr ;

$ebcdic_chars += 0x5F.chr ; # ^
$ascii_chars += 0x5E.chr ;

#$ebcdic_chars += 0x60.chr ; # -
#$ascii_chars += 0x2D.chr ;

$ebcdic_chars += 0x61.chr ; # /
$ascii_chars += 0x2F.chr ;

$ebcdic_chars += 0x6B.chr ; # ,
$ascii_chars += 0x2C.chr ;

$ebcdic_chars += 0x6C.chr ; # %
$ascii_chars += 0x25.chr ;

$ebcdic_chars += 0x6D.chr ; # _
$ascii_chars += 0x5F.chr ;

$ebcdic_chars += 0x6F.chr ; # ?
$ascii_chars += 0x3F.chr ;

$ebcdic_chars += 0x79.chr ; # `
$ascii_chars += 0x60.chr ;

$ebcdic_chars += 0x7A.chr ; # :
$ascii_chars += 0x3A.chr ;

$ebcdic_chars += 0x7B.chr ; # #
$ascii_chars += 0x23.chr ;

$ebcdic_chars += 0x7C.chr ; # @
$ascii_chars += 0x40.chr ;

$ebcdic_chars += 0x7D.chr ; # '
$ascii_chars += 0x27.chr ;

$ebcdic_chars += 0x7E.chr ; # =
$ascii_chars += 0x3D.chr ;

$ebcdic_chars += 0x7F.chr ; # "
$ascii_chars += 0x22.chr ;

$ebcdic_chars += 0xBA.chr ; # [
$ascii_chars += 0x5B.chr ;

$ebcdic_chars += 0xBB.chr ; # ]
$ascii_chars += 0x5D.chr ;

$ebcdic_chars += 0xC0.chr ; # {
$ascii_chars += 0x7B.chr ;

$ebcdic_chars += 0xD0.chr ; # }
$ascii_chars += 0x7D.chr ;

$ebcdic_chars += 0xE0.chr ; # \
$ascii_chars += 0x5C.chr ;

$ebcdic_chars += 0x50.chr ; # &
$ascii_chars += 0x26.chr ;

I derived the ASCII character codes by writing this 1 line script:

puts ARGV[0].unpack('H64').to_s.upcase ;

Then, I enter:

ruby charcnvt.rb '-'

and it responds that the hex code for minus (hyphen) is 0x2D. Ok, fine.
However, then I do the translate, and it converts my ebcdic 0x60 into
ascii 0x2C (a comma) instead of 0x2D.

Todd Burch · May 10, 2007

(up above, I had the 0x60 / 0x2D pair commented out while I was testing
other possibilities. When not commented out, it failed)

I still haven't ruled out the possibility of another character hosing the
string. I'll reorder my strings in ASCII sequence and see if that
changes anything.

This is killing me.

Rob Biedenharn · May 10, 2007

Todd said:
Here's a whole program that illustrates the error. I'm running under
Windows Ruby 1.8.5. (it's ugly and chopped up, but it works... er...
doesn't work... er... you know what I mean)

Todd

First, damn you for getting me interested in this ;-)

Here's a whole program (with a couple tests!) that uses an array to
map from an EBCDIC index to a single character ASCII string (easier
to construct than the character value).

I hope you find some useful idioms in there.

-Rob

Rob Biedenharn http://agileconsultingllc.com

#!/usr/bin/env ruby -w
#
# Ruby program to convert a z/OS EBCDIC to ASCII.
#

class Record
EBCDIC_TO_ASCII = Array.new(256) # the EBCDIC position holds the ASCII value

EBCDIC_TO_ASCII[0x40] = ' '

EBCDIC_TO_ASCII[0x81..0x89] = [*'a'..'i']
EBCDIC_TO_ASCII[0x91..0x99] = [*'j'..'r']
EBCDIC_TO_ASCII[0xA2..0xA9] = [*'s'..'z']
EBCDIC_TO_ASCII[0xC1..0xC9] = [*'A'..'I']
EBCDIC_TO_ASCII[0xD1..0xD9] = [*'J'..'R']
EBCDIC_TO_ASCII[0xE2..0xE9] = [*'S'..'Z']

EBCDIC_TO_ASCII[0x4B] = '.'
EBCDIC_TO_ASCII[0x4C] = '<'
EBCDIC_TO_ASCII[0x4D] = '('
EBCDIC_TO_ASCII[0x4E] = '+'
EBCDIC_TO_ASCII[0x4F] = '|'
EBCDIC_TO_ASCII[0x50] = '&'

EBCDIC_TO_ASCII[0x5A] = '!'
EBCDIC_TO_ASCII[0x5B] = '$'
EBCDIC_TO_ASCII[0x5C] = '*'
EBCDIC_TO_ASCII[0x5D] = ')'
EBCDIC_TO_ASCII[0x5E] = ';'
EBCDIC_TO_ASCII[0x5F] = '^'
EBCDIC_TO_ASCII[0x60] = '-'
EBCDIC_TO_ASCII[0x61] = '/'

EBCDIC_TO_ASCII[0x6B] = ','
EBCDIC_TO_ASCII[0x6C] = '%'
EBCDIC_TO_ASCII[0x6D] = '_'
EBCDIC_TO_ASCII[0x6E] = '>'
EBCDIC_TO_ASCII[0x6F] = '?'
EBCDIC_TO_ASCII[0X79] = '`'
EBCDIC_TO_ASCII[0x7A] = ':'
EBCDIC_TO_ASCII[0x7B] = '#'
EBCDIC_TO_ASCII[0x7C] = '@'
EBCDIC_TO_ASCII[0x7D] = '\''
EBCDIC_TO_ASCII[0x7E] = '='
EBCDIC_TO_ASCII[0x7F] = '"'
EBCDIC_TO_ASCII[0xBA] = '['
EBCDIC_TO_ASCII[0xBB] = ']'
EBCDIC_TO_ASCII[0xC0] = '{'
EBCDIC_TO_ASCII[0xD0] = '}'
EBCDIC_TO_ASCII[0xE0] = '\\'

EBCDIC_TO_ASCII[0xF0..0xF9] = [*'0'..'9']

def self.to_ascii str
str.split(//).map {|e| EBCDIC_TO_ASCII[e[0]] || 0.chr }.join
end

# This Record.to_h() method converts a hex byte into a displayable,
# human readable value. i.e. X'F5' to "F5".
#
def self.to_h str, chunk=32
str.unpack('H*').first.scan(/.{2,#{2*chunk}}/).join("\n")
end

#
# This method called to dump the data in hex format.
#
def self.dump_area data, chunk=16
fsize = data.size
lines,last = fsize.divmod(chunk) # get # of full lines and figure the remainder.
i = 0
lines.downto(0) do
str = data[i,chunk]
hexdata = str.unpack('H*').first.scan(/.{2,8}/).join(" ") #.upcase
printf "%08X %s %s\n", i, hexdata, to_ascii(str).tr("\000-\037\177-\377",'.')
i += chunk
end
end
end

if __FILE__ == $0
require 'test/unit'

puts "EBCDIC Table"
Record.dump_area((0...Record::EBCDIC_TO_ASCII.size).map{|e|
e.chr}.join)

class EbcdicTest < Test::Unit::TestCase
def setup
@instring_ascii = "AB>AA<AC"
# EBCDIC for "AB>AA<AC"
@instring_ebcdic = 0XC1.chr + 0xC2.chr + 0x6E.chr + 0xC1.chr +
0xC1.chr + 0x4C.chr + 0xC1.chr + 0xC3.chr
end

def test_simple
assert_equal @instring_ascii, Record.to_ascii(@instring_ebcdic)
end
def test_numbers
@expected = "$1,234.95"
# $ 1 , 2 3 4 . 9 5
@input = [0x5B, 0xF1, 0x6B, 0xF2, 0xF3, 0xF4, 0x4B, 0xF9,
0xF5].map {|e| e.chr}.join
assert_equal @expected, Record.to_ascii(@input)
end
end
end
__END__

Pit Capitain · May 10, 2007

Todd said:
Here's a whole program that illustrates the error. I'm running under
Windows Ruby 1.8.5. (it's ugly and chopped up, but it works... er...
doesn't work... er... you know what I mean)

Hi Todd,

first, in case you don't know, you don't have to terminate statements
with ";". Second, it's better to fill $ebcdic_chars and $ascii_chars
once at the beginning of your program and not every time the method
#doit is called.

Now to your problem: for #tr the character "-" is special. It lets you
define character ranges. See the docs for more info. In order to get a
literal "-" you have to escape it with a "\". The backslash is the
second part of your problem. You have a single backslash both in
$ebcdic_chars (EBCDIC 0x5C, the asterisk, is a backslash when read as
ASCII) and in $ascii_chars, and you have to escape that, too.

One way to fix your code would be to double the line

$ebcdic_chars += 0x5C.chr # *

so that you get two backslashes (0x5C) in a row, change the line

$ascii_chars += '-'

to

$ascii_chars += '\\-'

and finally change the line

$ascii_chars += '\\' # escape char is special - must be doubled

to

$ascii_chars += '\\\\' # escape char is special - must be doubled

Note that '\\'.size == 1.
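
To see the special characters in isolation, here is a small sketch of how tr reads ranges and escapes. The sample strings are made up for illustration and have nothing to do with the EBCDIC tables; it only shows the behaviour described above.

```ruby
# '-' between two characters is a range, not a literal hyphen.
puts "hello".tr('a-y', 'b-z')      # prints ifmmp

# An escaped '-' is literal. The Ruby literal '\\-' holds two
# characters: a backslash and a hyphen.
puts "2007-05-10".tr('\\-', '/')   # prints 2007/05/10

# A literal backslash must itself be escaped for tr, so the Ruby
# literal needs four backslashes to produce the two that tr sees.
puts 'C:\\ruby'.tr('\\\\', '/')    # prints C:/ruby

# A leading '^' negates the set: everything except the listed characters.
puts "hello".tr('^aeiou', '*')     # prints *e**o
```

The same rules apply to both arguments, which is why an unescaped - or \ in either the from string or the to string shifts every pair after it.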

Regards,
Pit

Todd Burch · May 10, 2007

Pit - you are DA MAN!!! WOO-HOO!!! Success with just the few things
you pointed out did the trick!! I am one very happy geek!!
WOO-HOO!!!!

Pit said:
first, in case you don't know, you don't have to terminate statements
with ";".

Yes, I know. It's a habit. I write in so many languages where it is
either optional or required, I simply choose to use it all the time.
Sometimes, I don't know my fingers even added it.

Pit said:
Second, it's better to fill $ebcdic_chars and $ascii_chars
once at the beginning of your program and not every time the method
#doit is called.

Absolutely. This was chopped up from my much larger program. And even
then, #doit is only called once. My plan is to take these tables out of
global variables and make them constants in another Module.

So, are you somewhere where I could buy you a cold beverage? It would be
my pleasure!

Todd (Katy, Texas)
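
For the plan of moving the tables out of globals and into constants in a module, one possible shape is sketched below. It assumes Ruby 1.9 or later; the module name Ebcdic, the PAIRS layout and the tr_escape helper are all hypothetical, and only a handful of pairs from the thread are included. Escaping is done once when the constants are built, so the table itself can list the raw characters.

```ruby
# Hypothetical module; the name and layout are assumptions for this sketch.
module Ebcdic
  # [EBCDIC byte, ASCII character] pairs; only a few entries shown.
  PAIRS = [
    [0x40, ' '], [0x5C, '*'], [0x5D, ')'], [0x5E, ';'], [0x5F, '^'],
    [0x60, '-'], [0x61, '/'], [0x6E, '>'], [0xE0, '\\'],
    [0xC1, 'A'], [0xC2, 'B'], [0xF1, '1'], [0xF2, '2']
  ].freeze

  # Prefix the characters tr treats specially (\, - and ^) with a backslash.
  def self.tr_escape(str)
    str.gsub(/[\\\-^]/) { |ch| "\\" + ch }
  end

  # Built once at load time; pack('C*') turns the byte values into a string.
  FROM = tr_escape(PAIRS.map { |byte, _| byte }.pack('C*')).freeze
  TO   = tr_escape(PAIRS.map { |_, char| char }.join).freeze

  def self.to_ascii(data)
    data.tr(FROM, TO)
  end
end

sample = [0xC1, 0x60, 0xC2, 0x6E, 0xE0, 0xF1, 0x5C, 0xF2].pack('C*')
puts Ebcdic.to_ascii(sample)   # prints A-B>\1*2
```

Keeping each EBCDIC byte next to its ASCII character also removes the need to keep two long strings in the same order by hand.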
