encoding problem

Discussion in 'ASP .Net' started by Jim Lawton, Jan 11, 2005.

  1. Jim Lawton

    Jim Lawton Guest

    Hi,

    .NET C# HttpHandler, plain HTML form at the browser.

    GBP pound sign problem (I know, I know - I *can* decode it, but I need to
    understand what I should be doing and why).

    I am uploading text data from a form. This data is either typed directly into a
    textarea, or is a file stream originating from a .txt file (or some other basic
    text file, e.g. from a Mac or Unix; I don't necessarily know at present that
    it's only .txt).

    The page encoding is :-
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

    On arrival at the server the content encoding is, sure enough, UTF8.

    Data typed into the textarea and read into a string is displayed in the debugger
    as pounds (£).

    Data uploaded as a file stream contains the single byte 0xA3 for each GBP
    pound sign.

    I process the input stream like this :-

    public static string StreamToString(Stream aStream)
    {
        aStream.Position = 0;
        long i = aStream.Length;
        byte[] buffer = new byte[i];

        // Read the whole stream into the buffer, then decode it as UTF-8.
        aStream.Read(buffer, 0, (int)aStream.Length);
        return BytesToUTF8String(buffer);
    }

    public static string BytesToUTF8String(byte[] Array)
    {
        Encoding utf8 = Encoding.UTF8;
        char[] utf8Chars = new char[utf8.GetCharCount(Array, 0, Array.Length)];
        utf8.GetChars(Array, 0, Array.Length, utf8Chars, 0);

        return new string(utf8Chars);
    }

    The resulting string contains nothing ...

    If I use ASCII instead of UTF8, the text makes sense except that my GBP signs
    come out as question marks (?).

    If I use UTF7 I get an apparently OK decoding.

    I am dubious about using UTF7 for no better reason than that it works. Is there
    logic here? What should I be doing?

    Thanks,
    Jim
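
A lone 0xA3 byte is how single-byte Western code pages such as ISO-8859-1 and Windows-1252 encode £; in UTF-8 the same character is the two bytes 0xC2 0xA3. The symptoms described above can be reproduced with a minimal sketch like the one below. `PoundSignDemo`, `Sample` and `Decode` are hypothetical names used only for illustration, and the sample bytes assume the uploaded file was saved in a single-byte Western code page.

```csharp
using System;
using System.Text;

// Hypothetical demo class, not part of the original post.
public static class PoundSignDemo
{
    // "£10" as saved in a single-byte Western code page (assumption):
    // the pound sign is the lone byte 0xA3.
    public static readonly byte[] Sample = { 0xA3, 0x31, 0x30 };

    // Decodes the sample bytes with whichever encoding the caller supplies.
    public static string Decode(Encoding encoding)
    {
        return encoding.GetString(Sample);
    }
}
```

Calling `PoundSignDemo.Decode(Encoding.GetEncoding("iso-8859-1"))` gives back the pound sign, `Encoding.ASCII` turns the 0xA3 byte into `?`, and `Encoding.UTF8` cannot produce a pound sign because a lone 0xA3 is not a valid UTF-8 sequence. UTF7 appearing to work is probably a side effect of how that decoder treats bytes above 0x7F rather than a sign that the file is UTF-7.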
     
    Jim Lawton, Jan 11, 2005
    #1

  2. bruce barker

    bruce barker Guest

    it doesn't really matter what encoding you use for the page response; what's
    important is the encoding used on the post from the browser. the browser
    picks this (though often it will match). you should check the content-type
    header the browser sends to determine the character set. for an html form
    post (application/x-www-form-urlencoded) ISO-8859-1 is the default character
    set, not utf8.

    -- bruce (sqlwork.com)
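
One way to act on the advice above is to take the character set from the Content-Type value that arrives with the data and only fall back to a default when none is given. The following is a rough sketch under assumptions: `UploadDecoder`, `EncodingFromContentType` and `StreamToString` are hypothetical helper names, the header value is assumed to be available as a string (for example from the request or from the uploaded file part), and the fallback encoding is the caller's guess.

```csharp
using System;
using System.IO;
using System.Net.Mime;
using System.Text;

// Hypothetical helper class for illustration only.
public static class UploadDecoder
{
    // Picks an Encoding from a Content-Type header value such as
    // "text/plain; charset=iso-8859-1"; returns the fallback when the
    // header is missing, malformed, or names an unknown charset.
    public static Encoding EncodingFromContentType(string contentType, Encoding fallback)
    {
        if (string.IsNullOrEmpty(contentType)) return fallback;
        try
        {
            string charset = new ContentType(contentType).CharSet;
            return string.IsNullOrEmpty(charset) ? fallback : Encoding.GetEncoding(charset);
        }
        catch (FormatException) { return fallback; }   // header could not be parsed
        catch (ArgumentException) { return fallback; } // charset name not recognised
    }

    // Reads the whole stream as text using the chosen encoding.
    public static string StreamToString(Stream stream, Encoding encoding)
    {
        if (stream.CanSeek) stream.Position = 0;
        using (StreamReader reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Browsers often send no charset for an uploaded file, so the fallback matters in practice; passing `Encoding.GetEncoding("iso-8859-1")` as the fallback matches the default mentioned above, but it remains a guess about how the file was saved.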

     
    bruce barker, Jan 11, 2005
    #2

  3. Jim Lawton

    Jim Lawton Guest

    On Tue, 11 Jan 2005 10:03:03 -0800, "bruce barker" <>
    wrote:

    >it doesn't really matter what encoding you use for the page response; what's
    >important is the encoding used on the post from the browser. the browser
    >picks this (though often it will match). you should check the content-type
    >header the browser sends to determine the character set. for an html form
    >post (application/x-www-form-urlencoded) ISO-8859-1 is the default character
    >set, not utf8.
    >
    >-- bruce (sqlwork.com)


    Thanks Bruce,

    for anyone googling this topic in future, there's more in
    dotnet.languages.csharp
    Message-ID: <>

    cheers Jim
     
    Jim Lawton, Jan 12, 2005
    #3
