Encoding, "extended ansi", and unicode in 1.9

Dennis Nedry

I have a routine for converting ANSI text with "extended" IBM characters to
HTML. It is as follows...

EXTENDED_ANSI_TABLE = {
  227.chr => "<br>",
  32.chr => "&nbsp;",
  128.chr => "&Ccedil;",  #128 C, cedilla (199)
  129.chr => "&uuml;",    #129 u, umlaut (252)
  130.chr => "&eacute;",  #130 e, acute accent (233)
  131.chr => "&acirc;",   #131 a, circumflex accent (226)
  132.chr => "&auml;",    #132 a, umlaut (228)
  133.chr => "&agrave;",  #133 a, grave accent (224)
  134.chr => "&aring;",   #134 a, ring (229)
  135.chr => "&ccedil;",  #135 c, cedilla (231)
  136.chr => "&ecirc;",   #136 e, circumflex accent (234)
  137.chr => "&euml;",    #137 e, umlaut (235)
  138.chr => "&egrave;",  #138 e, grave accent (232)
  139.chr => "&iuml;",    #139 i, umlaut (239)
  140.chr => "&icirc;",   #140 i, circumflex accent (238)
  141.chr => "&igrave;",  #141 i, grave accent (236)
  #big huge list continues for pages...
}


def parse_ansi_ext(str)
  EXTENDED_ANSI_TABLE.each_pair {|color, result|
    str = str.gsub(color, result)
  }
  return str
end

This worked in 1.8, no problem.

If the input contains a character above 127.chr, it now bombs with the error:

"Encoding::CompatibilityError at /
incompatible encoding regexp match (ASCII-8BIT regexp with ISO-8859-1 string)"

I've tried various acts of desperation to fix it, to no avail. I
don't understand exactly what is wrong...

Thanks,

Dennis
 
Michael Fellinger

str has the encoding ISO-8859-1, probably inherited from your system locale
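
To illustrate the mismatch: in 1.9, 128.chr and above produce ASCII-8BIT (binary) strings, so the table keys are binary while str is tagged ISO-8859-1, and gsub refuses to match one against the other. One possible workaround, as a minimal sketch only, is to do the replacement on a binary-tagged copy of the input so both sides are compared as raw bytes. The names EXTENDED_ANSI_SAMPLE and parse_ansi_ext_binary are hypothetical, and the table is shortened to a few entries from the original post.

```ruby
# Hypothetical, shortened version of the table from the original post.
# Integer#chr returns a binary (ASCII-8BIT) string for values 128..255.
EXTENDED_ANSI_SAMPLE = {
  227.chr => "<br>",
  128.chr => "&Ccedil;",
  129.chr => "&uuml;",
  130.chr => "&eacute;"
}

# Hypothetical helper: replaces extended bytes with HTML entities.
def parse_ansi_ext_binary(str)
  # Re-tag a copy as binary so the input and the table keys share one
  # encoding; the bytes themselves are left unchanged.
  out = str.dup.force_encoding(Encoding::BINARY)
  EXTENDED_ANSI_SAMPLE.each_pair do |byte, entity|
    out = out.gsub(byte, entity)
  end
  out
end

input = "caf\x82".dup.force_encoding("ISO-8859-1")
puts parse_ansi_ext_binary(input)  # prints caf&eacute;
```

The returned string is still tagged as binary; if every high byte has been replaced, it contains only ASCII and can be re-tagged with force_encoding as needed. If the input really is IBM code page 437, transcoding it to UTF-8 instead of using a lookup table may also be worth considering, though that is an assumption about the data.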
 
