I discovered this post:
http://www.ibm.com/developerworks/library/x-tipsaxxni/
and implemented both approaches (SAX and Xerces XNI).
Unfortunately, for the attached XML file, both methods
report UTF-8 as the encoding, although the file is clearly
not UTF-8 encoded: every character, including the umlaut
and the Euro sign, takes one byte, and the declared
encoding is not UTF-8 either.
Does anyone have an idea why that is so? And how can I
get an XML parser to determine the correct encoding?
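
For comparison, here is a minimal sketch of a third way to query the
encoding, using StAX (javax.xml.stream) instead of the SAX and XNI code
from the article. It only reads the encoding named in the XML declaration,
via XMLStreamReader.getCharacterEncodingScheme(), and it assumes the StAX
implementation bundled with the JDK. The class name DeclaredEncodingProbe
and the sample document are made up for illustration; whether this reports
the right value for the attached file is untested.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not taken from the article.
public class DeclaredEncodingProbe {

    // Returns the encoding named in the XML declaration,
    // or null if the document does not declare one.
    static String declaredEncoding(InputStream in) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Hand over the raw bytes (not a Reader) so the parser reads
        // the XML declaration itself instead of pre-decoded characters.
        XMLStreamReader reader = factory.createXMLStreamReader(in);
        try {
            return reader.getCharacterEncodingScheme();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document: one umlaut, stored as a single byte.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><doc>\u00e4</doc>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(declaredEncoding(new ByteArrayInputStream(bytes)));
    }
}
```

Note that this returns only what the declaration says. It does not check
that the bytes in the file actually match that encoding.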