Harald Kirsch
Consider a file containing variable-length records separated
by a freely chosen Unicode character. The encoding of the file is not
necessarily one where each character is represented by a fixed number
of bytes. In particular, the encoding cannot be hard-coded in the software.
Is there an easy way to get hold of the byte offsets (seek positions) of
the records in order to create an index pointing to the records?
What makes this particularly difficult is the combination of non-fixed
character code length and buffering in a CharsetDecoder.
The solution I can think of involves very clumsy stuff with driving a
CharsetDecoder one character at a time. Am I missing something?
Should this not be a few calls to available library methods?
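
To make the clumsy approach concrete, here is a minimal sketch of what I
have in mind. It feeds the decoder one byte at a time and counts the bytes
supplied so far, so the count at the moment the separator appears is the
offset of the next record. The class and method names (RecordIndexer,
recordOffsets) are made up for illustration. The sketch assumes the decoder
emits a character as soon as its last byte has been supplied; I believe this
holds for the common JDK charsets such as UTF-8 and UTF-16, but I do not see
it guaranteed by the API. For stateful encodings an offset may also not be a
safe place to start decoding from.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a library class.
final class RecordIndexer {

    // Returns the byte offset at which each record starts. 'separator' is a
    // Unicode code point. If the data ends with a separator, the last offset
    // equals data.length and denotes an empty trailing record.
    static List<Long> recordOffsets(byte[] data, Charset charset, int separator) {
        CharsetDecoder decoder = charset.newDecoder(); // reports malformed input by default
        ByteBuffer pending = ByteBuffer.allocate(16);  // bytes of a not yet complete character
        CharBuffer chars = CharBuffer.allocate(4);     // room for a surrogate pair and more
        List<Long> offsets = new ArrayList<>();
        offsets.add(0L);                               // the first record starts at offset 0

        long fed = 0;                                  // number of bytes given to the decoder
        for (byte b : data) {
            pending.put(b);
            fed++;

            pending.flip();
            CoderResult result = decoder.decode(pending, chars, false);
            if (result.isError()) {
                throw new IllegalArgumentException("undecodable input near byte " + fed);
            }
            pending.compact();                         // keep bytes the decoder did not consume

            chars.flip();
            // Assumption: any character produced now ended with the byte just fed.
            if (chars.toString().codePoints().anyMatch(cp -> cp == separator)) {
                offsets.add(fed);                      // the next record starts right here
            }
            chars.clear();
        }
        return offsets;
    }

    public static void main(String[] args) {
        byte[] data = "a\u00e9|bc|\ud83d\ude00d".getBytes(StandardCharsets.UTF_8);
        System.out.println(recordOffsets(data, StandardCharsets.UTF_8, '|')); // prints [0, 4, 7]
    }
}
```

The offsets could then be passed to something like RandomAccessFile.seek()
to jump to a record. For a real file the same loop would read from a
buffered stream instead of a byte array. Calling the decoder once per byte
is exactly the part that feels clumsy to me.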
Any ideas?
Harald.