Reading and writing extended ascii characters

Geoff Warnock · Mar 9, 2005

I am writing a Java program (using JDK 1.5.0) for a project. It
has to read in words from an input file. It is a language translator,
so each line consists of a word in English, German, French, Spanish,
etc. Some of these words contain accents and other characters whose
ASCII values are beyond 127.

My problem is that I have a punctuation removal method which strips
all words of unnecessary characters before adding each line to a new
Binary Search Tree Node. The method just drops the characters with
values greater than 127 and does not store them. Why is this happening?

for(int i = 0; i < data.length; i++)
{
    int j = 0;
    while(j != data.length())
    {
        c = data.charAt(j);
        if((c <= 'z') && (c >= 'a'))
            temp = temp + c;
        else if((c <= 'Z') && (c >= 'A'))
            temp = temp + c;
        else if(c == 39)    // apostrophe
            temp = temp + c;
        else if((c > (char)127) && (c < (char)168))
            temp = temp + c;
        else if(c == ' ')
        {
            // keep a space only when it sits between two word characters
            if((j > 0) && (j < data.length() - 1))
            {
                lastChar = data.charAt(j-1);
                nextChar = data.charAt(j+1);
                if(((lastChar <= 'z' && lastChar >= 'a') ||
                    (lastChar <= 'Z' && lastChar >= 'A') ||
                    (lastChar == 39)) &&
                   ((nextChar <= 'z' && nextChar >= 'a') ||
                    (nextChar <= 'Z' && nextChar >= 'A') ||
                    (nextChar == 39)))
                    temp = temp + c;
            }
        }
        j++;
    }
}
data = temp;
temp = "";

It is the "else if((c > (char)127) && (c < (char)168))" branch that
will not work. Any ideas?

Thanks, Geoff Warnock.

Gordon Beaton · Mar 9, 2005

Geoff Warnock said:
My problem is that I have a punctuation removal method which strips
all words of unnecessary characters before adding each line to a new
Binary Search Tree Node. The method just drops the characters with
values greater than 127 and does not store them. Why is this happening?
[...]

It is the "else if((c > (char)127) && (c < (char)168))" branch that
will not work. Any ideas?

Probably it has to do with how "c" and "data" are declared or
initialized, but you didn't post those parts.

I'll bet that you've read your data from the file without specifying
the correct character encoding. Have you confirmed (e.g. with
System.out.println()) that lines containing those characters have been
read correctly?
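
To illustrate the encoding point, here is a minimal sketch of reading a file with an explicit charset instead of the platform default. The class name EncodingAwareReader, the sample word, and the choice of ISO-8859-1 are assumptions for illustration; the thread does not say which encoding the input file actually uses. The sketch uses Java 7+ APIs; on JDK 1.5 the same idea would be new InputStreamReader(new FileInputStream(name), "ISO-8859-1").

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class EncodingAwareReader {

    // Reads every line of the file, decoding its bytes with the given
    // charset rather than whatever the platform default happens to be.
    static List<String> readLines(Path file, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), charset))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data: "caf\u00E9" stored as ISO-8859-1 bytes.
        Path file = Files.createTempFile("words", ".txt");
        Files.write(file, "caf\u00E9\n".getBytes(StandardCharsets.ISO_8859_1));

        for (String line : readLines(file, StandardCharsets.ISO_8859_1)) {
            // Print numeric char values so the check does not depend on
            // the console's own encoding.
            for (char c : line.toCharArray()) {
                System.out.print((int) c + " ");
            }
            System.out.println();
        }
        Files.delete(file);
    }
}
```

Printing the numeric values rather than the characters themselves separates "was the file decoded correctly?" from "can my console display the result?".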

That said, I'd like to suggest that instead of doing it "the hard
way", and making assumptions about the values of various characters,
you consider using a simple test like this:

if (Character.isLetter(c)) {
    // keep the character
    temp = temp + c;
}
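
A rough sketch of how that test might replace the hand-written range checks. The class and method names (LetterFilter, keepLetters) are made up for illustration, and the sketch only keeps letters and apostrophes; it does not reproduce the original rule about spaces between words.

```java
class LetterFilter {

    // Keeps letters (in any script) and apostrophes; drops everything else.
    static String keepLetters(String data) {
        StringBuilder kept = new StringBuilder(data.length());
        for (int j = 0; j < data.length(); j++) {
            char c = data.charAt(j);
            if (Character.isLetter(c) || c == '\'') {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // \u00E9 is e-acute, written as an escape so the source file's
        // own encoding cannot interfere.
        String cleaned = keepLetters("caf\u00E9, d'accord!");
        System.out.println(cleaned.equals("caf\u00E9d'accord")); // prints "true"
    }
}
```

Because Character.isLetter works on Unicode characters, accented letters pass the test without any assumptions about their numeric values.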

/gordon

Daniel Tryba · Mar 9, 2005

Geoff Warnock said:
It is the "else if((c > (char)127) && (c < (char)168))" branch that
will not work. Any ideas?

How is it not working?

BTW, temp is a String. Looping and appending one char at a time to it is
very inefficient; a StringBuffer or StringBuilder avoids the repeated
copying.

BTW2, your input is a String; you should look at regular expressions to
filter out unwanted characters.
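
A minimal sketch of the regular expression approach, assuming the goal is to keep letters, apostrophes, and spaces. The class name RegexFilter and the exact pattern are illustrative choices, not something specified in the thread.

```java
class RegexFilter {

    // Removes every character that is not a Unicode letter (\p{L}),
    // an apostrophe, or a space.
    static String strip(String data) {
        return data.replaceAll("[^\\p{L}' ]", "");
    }

    public static void main(String[] args) {
        String cleaned = strip("Hallo, Welt! caf\u00E9?");
        System.out.println(cleaned.equals("Hallo Welt caf\u00E9")); // prints "true"
    }
}
```

The \p{L} class matches letters in any script, so the pattern needs no hard-coded character ranges.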

BTW3, Strings are Unicode and your source is (most likely) something
else, so anything you read gets translated to Unicode. On most platforms
the default system encoding is ISO-8859-1, which has no printable
characters for values > 127 and < 160.
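
To make this concrete, here is a small sketch that decodes single bytes as ISO-8859-1 and looks at the resulting char values. It assumes, purely for illustration, that the file's accented letters arrive as ISO-8859-1 bytes (the thread does not confirm this); DecodeDemo and its helper methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

class DecodeDemo {

    // Decodes one byte as ISO-8859-1 and returns the resulting char.
    static char decodeLatin1(int byteValue) {
        byte[] single = {(byte) byteValue};
        return new String(single, StandardCharsets.ISO_8859_1).charAt(0);
    }

    // The range test from the original post.
    static boolean inPostedRange(char c) {
        return (c > (char) 127) && (c < (char) 168);
    }

    public static void main(String[] args) {
        // 0xE9 is e-acute in ISO-8859-1.
        char eAcute = decodeLatin1(0xE9);
        System.out.println((int) eAcute);               // prints "233"
        System.out.println(inPostedRange(eAcute));      // prints "false"
        System.out.println(Character.isLetter(eAcute)); // prints "true"

        // 0x85 falls in the 128..159 block, which holds control codes.
        char control = decodeLatin1(0x85);
        System.out.println(Character.isISOControl(control)); // prints "true"
        System.out.println(Character.isLetter(control));     // prints "false"
    }
}
```

Under that assumption the accented letter ends up at 233, outside the 128 to 167 window the posted test accepts, which would explain why such characters are dropped.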
