nio charset doubt

jimgardener

Hi,
I tried using the java.nio.charset classes to decode the contents of a text file.
The text file 'samplein.txt' has 3 lines, as below:
first
second
third

I wrote this code:

import java.nio.*;
import java.nio.charset.*;
import java.io.*;
import java.nio.channels.*;

public class CharsetDemo {

    public static void main(String[] args) {
        String inputfile = "samplein.txt";

        try {
            RandomAccessFile inf = new RandomAccessFile(inputfile, "r");

            // File length in bytes
            long leninf = inf.length();
            debug("leninf:" + leninf);

            // Map the whole file into memory, read-only
            FileChannel inc = inf.getChannel();
            MappedByteBuffer mapbuf = inc.map(FileChannel.MapMode.READ_ONLY, 0, leninf);

            // ISO-8859-1 decodes each byte to exactly one char
            Charset latin1 = Charset.forName("ISO-8859-1");
            CharsetDecoder decoder = latin1.newDecoder();
            CharBuffer charbuf = decoder.decode(mapbuf);
            debug("cbarraylen:" + charbuf.array().length);

            // Print every decoded char, line terminators included, followed by '+'
            for (char i : charbuf.array()) {
                System.out.print(i + "+");
            }

        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void debug(String msg) {
        System.out.println(msg);
    }
}


When I run this, I get this output:

leninf:20
cbarraylen:20
f+i+r+s+t+
+
+s+e+c+o+n+d+
+
+t+h+i+r+d+

I have 2 doubts.
There are 16 characters in total, plus 2 newline characters. So why is
the length of both the RandomAccessFile and the CharBuffer array 20?

I am also wondering how the + before the s in 'second' gets printed. The +
between 'f+i+r+s+t+' and '+s+e+c+o+n+d+' must be printed when the for
loop's variable i holds the newline character, but I can't make out
where the extra + (before the s) is coming from.

Can someone make it clear?
jim
 
Silvio Bierman

You are running this on Windows and have both CR + LF line separators in
the file?
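
A minimal sketch of the CR + LF arithmetic, assuming the file holds exactly the three words separated by Windows line endings, with no terminator after the last line (the thread never shows the raw bytes). The class name CrLfByteCount and its helper methods are made up for illustration, and StandardCharsets needs Java 7 or later. It decodes the assumed bytes in memory rather than mapping a file.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original post.
class CrLfByteCount {

    // Decode raw bytes as ISO-8859-1, where every byte maps to exactly one char.
    static CharBuffer decodeLatin1(byte[] raw) {
        try {
            return StandardCharsets.ISO_8859_1.newDecoder().decode(ByteBuffer.wrap(raw));
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Count how many of the decoded chars equal the wanted character.
    static int count(CharBuffer chars, char wanted) {
        int n = 0;
        for (int i = chars.position(); i < chars.limit(); i++) {
            if (chars.get(i) == wanted) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // Assumed file contents: 16 letters plus two CR LF pairs.
        byte[] raw = "first\r\nsecond\r\nthird".getBytes(StandardCharsets.ISO_8859_1);
        CharBuffer chars = decodeLatin1(raw);
        System.out.println("bytes:" + raw.length);          // prints bytes:20
        System.out.println("chars:" + chars.remaining());   // prints chars:20
        System.out.println("CR:" + count(chars, '\r') + " LF:" + count(chars, '\n')); // prints CR:2 LF:2
    }
}
```

Under that assumption the count is 5 + 2 + 6 + 2 + 5 = 20, which matches both lengths in the question.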
 
RedGrittyBrick

jimgardener said:
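for(char i:charbuf.array()){
System.out.print(i+"+");

Replace that print statement so that control characters show up as their numeric codes:

for (char i : charbuf.array()) {
    if ( (int)i < 32 )
        System.out.print( (int)i );
    else
        System.out.print(i);
    System.out.print('+');
}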


jimgardener said:
i have 2 doubts,

(These are "questions", not "doubts" in Western English)

jimgardener said:
there are total 16 characters and 2 newline chars. Then how is it that
the length of RandomAccessFile and charbuffer array 20?

Make the changes noted above to see why. Consult an ASCII chart.

jimgardener said:
I am wondering how the + before s in 'second' is printed. the +
between 'f+i+r+s+t+' and '+s+e+c+o+n+d+' must be printed when
newline character is encountered by the for loop's i variable. But i
can't make out where the extra + (before s) is coming from

Make the changes noted above to see where.

jimgardener said:
can someone make it clear?

Yes.
http://en.wikipedia.org/wiki/Newline
 
jimgardener

Silvio Bierman said:
You are running this on Windows and have both CR + LF line separators in
the file?

OK, that must be it! Thanks, Silvio.

I printed the int values of the characters, and the values 13 and 10
each show up twice, i.e. CR and LF.

thanks
jim
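
The check described above can be reproduced without the file. This is only a sketch: the class name ControlCharDump and the method dump are hypothetical, the input bytes are an assumption about what samplein.txt contains, and StandardCharsets needs Java 7 or later. It applies the same "below 32 means print the number" rule suggested earlier in the thread.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original posts.
class ControlCharDump {

    // Render each char followed by '+', showing control chars (< 32) as numbers.
    static String dump(CharBuffer chars) {
        StringBuilder out = new StringBuilder();
        for (int i = chars.position(); i < chars.limit(); i++) {
            char c = chars.get(i);
            if (c < 32) {
                out.append((int) c);   // e.g. 13 for CR, 10 for LF
            } else {
                out.append(c);
            }
            out.append('+');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed file contents with Windows line endings.
        byte[] raw = "first\r\nsecond\r\nthird".getBytes(StandardCharsets.ISO_8859_1);
        CharBuffer chars = StandardCharsets.ISO_8859_1.decode(ByteBuffer.wrap(raw));
        // prints f+i+r+s+t+13+10+s+e+c+o+n+d+13+10+t+h+i+r+d+
        System.out.println(dump(chars));
    }
}
```

Each line break contributes two chars, so the original loop printed two '+' signs per line break: one after the CR and one after the LF. That accounts for the extra '+' before the s in 'second'.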
 
