nio charset doubt

Discussion in 'Java' started by jimgardener, Jul 2, 2008.

  1. jimgardener

    jimgardener Guest

    Hi,
    I tried using the java.nio.charset classes to decode the contents of a
    text file. The text file 'samplein.txt' has 3 lines, as below:
    first
    second
    third

    I wrote this code:

    import java.nio.*;
    import java.nio.charset.*;
    import java.io.*;
    import java.nio.channels.*;

    public class CharsetDemo {

        public static void main(String[] args) {
            String inputfile = "samplein.txt";

            try {
                RandomAccessFile inf = new RandomAccessFile(inputfile, "r");

                // File length in bytes, line separators included
                long leninf = inf.length();
                debug("leninf:" + leninf);
                FileChannel inc = inf.getChannel();
                MappedByteBuffer mapbuf = inc.map(FileChannel.MapMode.READ_ONLY, 0, leninf);

                // ISO-8859-1 decodes every byte to exactly one char
                Charset latin1 = Charset.forName("ISO-8859-1");
                CharsetDecoder decoder = latin1.newDecoder();
                CharBuffer charbuf = decoder.decode(mapbuf);
                debug("cbarraylen:" + charbuf.array().length);

                // Print each decoded char followed by '+'
                for (char i : charbuf.array()) {
                    System.out.print(i + "+");
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        public static void debug(String msg) {
            System.out.println(msg);
        }
    }


    When I run this, I get this output:

    leninf:20
    cbarraylen:20
    f+i+r+s+t+
    +
    +s+e+c+o+n+d+
    +
    +t+h+i+r+d+

    I have 2 doubts.
    There are 16 characters in total and 2 newline characters. So why is the
    length of both the RandomAccessFile and the CharBuffer array 20?

    I am also wondering how the + before the s in 'second' gets printed. The +
    between 'f+i+r+s+t+' and '+s+e+c+o+n+d+' must be printed when the for
    loop's variable i holds the newline character. But I can't make out where
    the extra + (before the s) is coming from.

    Can someone make it clear?
    jim
     
    jimgardener, Jul 2, 2008
    #1

  2. jimgardener wrote:
    > I have 2 doubts.
    > There are 16 characters in total and 2 newline characters. So why is the
    > length of both the RandomAccessFile and the CharBuffer array 20?
    >
    > I am also wondering how the + before the s in 'second' gets printed. The +
    > between 'f+i+r+s+t+' and '+s+e+c+o+n+d+' must be printed when the for
    > loop's variable i holds the newline character. But I can't make out where
    > the extra + (before the s) is coming from.
    >
    > Can someone make it clear?
    > jim


    You are running this on Windows and have both CR + LF line separators in
    the file?
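
    The arithmetic behind this suggestion can be checked without the file:
    16 letters plus two CR+LF pairs of 2 bytes each gives 20 bytes, and
    ISO-8859-1 turns each byte into exactly one char. A minimal sketch,
    assuming the file holds "first\r\nsecond\r\nthird" with no trailing line
    separator. The class and method names are made up for illustration, and
    Charset.decode is used as a shortcut for the explicit decoder in the
    original post.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

final class CrLfLengthSketch {

    // Decodes the bytes as ISO-8859-1, which maps every byte to exactly one char.
    static CharBuffer decodeLatin1(byte[] bytes) {
        return Charset.forName("ISO-8859-1").decode(ByteBuffer.wrap(bytes));
    }

    // Counts how often the wanted char occurs in the text.
    static int count(CharSequence text, char wanted) {
        int n = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == wanted) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // Assumed file content: three words separated by Windows (CR+LF) line breaks.
        byte[] bytes = "first\r\nsecond\r\nthird".getBytes(Charset.forName("ISO-8859-1"));
        CharBuffer chars = decodeLatin1(bytes);

        System.out.println("bytes: " + bytes.length);       // bytes: 20
        System.out.println("chars: " + chars.remaining());  // chars: 20
        System.out.println("CR: " + count(chars, '\r')
                + ", LF: " + count(chars, '\n'));            // CR: 2, LF: 2
    }
}
```

    The sketch reads the CharBuffer through remaining() and charAt() rather
    than array(), because in general the backing array can be larger than
    the decoded content.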
     
    Silvio Bierman, Jul 2, 2008
    #2

  3. jimgardener wrote:
    > for(char i:charbuf.array()){
    > System.out.print(i+"+");


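    // Use this instead of the print above: control characters
    // (codes below 32) are printed as numbers so they become visible.
    if ((int) i < 32) {
        System.out.print((int) i);
    } else {
        System.out.print(i);
    }
    System.out.print('+');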

    > i have 2 doubts,


    (These are "questions", not "doubts" in Western English)

    > There are 16 characters in total and 2 newline characters. So why is the
    > length of both the RandomAccessFile and the CharBuffer array 20?


    Make the changes noted above to see why. Consult an ASCII chart.

    > I am also wondering how the + before the s in 'second' gets printed. The +
    > between 'f+i+r+s+t+' and '+s+e+c+o+n+d+' must be printed when the for
    > loop's variable i holds the newline character. But I can't make out where
    > the extra + (before the s) is coming from.


    Make the changes noted above to see where.

    >
    > can someone make it clear?


    Yes.
    http://en.wikipedia.org/wiki/Newline



    --
    RGB
     
    RedGrittyBrick, Jul 2, 2008
    #3
  4. jimgardener

    jimgardener Guest


    > You are running this on Windows and have both CR + LF line separators in
    > the file?


    OK, that must be it! Thanks, Silvio.

    I printed the int values of the characters, and the values 13 and 10
    each appear twice, i.e. CR and LF.

    Thanks,
    jim
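
    The check described in this post can be sketched as a small
    self-contained program. It assumes the same file content as above,
    "first\r\nsecond\r\nthird", supplied as a string literal instead of being
    read from disk. The class and method names are hypothetical.

```java
final class ControlCharSketch {

    // Renders each char followed by '+'. Chars below 32 (ASCII control
    // characters such as CR = 13 and LF = 10) are shown as their numeric code.
    static String render(CharSequence text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 32) {
                out.append((int) c);
            } else {
                out.append(c);
            }
            out.append('+');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed file content with Windows (CR+LF) line separators.
        System.out.println(render("first\r\nsecond\r\nthird"));
        // prints f+i+r+s+t+13+10+s+e+c+o+n+d+13+10+t+h+i+r+d+
    }
}
```

    Each line break shows up as 13 followed by 10, which accounts for the
    4 extra chars and for the extra + signs in the original output.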
     
    jimgardener, Jul 2, 2008
    #4