Special character to &abc equivalents

Discussion in 'ASP .Net' started by Colin Peters, May 7, 2005.

  1. Colin Peters

    Colin Peters Guest

    Hi,

    I'm reading a file and writing it to the html output for a page.

    I've come across two difficulties which I would like to solve.

    The files contain special characters from European alphabets, namely
    the vowels with two little dots above them, called umlauts.

    Normally, these are rendered in html using "&auml;", but in the file
    they are just ä.

    1. I'm using a StreamReader to read the file and I have found that if I
    don't use System.Text.Encoding.UTF7 then the characters are lost
    completely. Is this the correct way, or is there a way to get the
    StreamReader to select the correct encoding automatically, or to use
    other code to determine which would be best?

    2. Having read the character from the file, it is output literally to
    the html, which I guess is to be expected. Is there a way to process a
    string in order to change the ä to &auml; and so on?

    Thanks in advance for any replies.
     
    Colin Peters, May 7, 2005
    #1

  2. My advice: set the underlying operating system's encoding to whatever
    you want, then use StreamReader and StreamWriter with
    System.Text.Encoding.Default, which uses the underlying OS encoding.

    I had the same problems with Turkish encoding, and this is the best
    solution (IMHO).
     
    Yunus Emre ALPÖZEN [MCAD.NET], May 7, 2005
    #2

  3. Joerg Jooss

    Joerg Jooss Guest

    UTF-7 is hardly what you want. Did you try ISO-8859-1? Or Windows-1252?
    In general, there's no way to guess a character encoding because
    there's no universal metadata that could tell you what encoding is
    being used.

    To put it differently: You must know the encoding, or allow the user to
    switch between possible encodings.
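
One partial exception is a file that starts with a byte order mark (BOM). The sketch below is only an illustration: the class and method names are made up, and it assumes ISO-8859-1 or a similar single-byte encoding is the right guess for files without a BOM. It asks StreamReader to honour a BOM when one is present and to fall back to the named encoding otherwise.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the framework.
public static class BomSniffer
{
    // Decodes a text file. If the file starts with a byte order mark
    // (UTF-8, UTF-16 or UTF-32), StreamReader switches to that encoding;
    // otherwise the caller's fallback encoding is used for every byte.
    public static string ReadWithFallback(string path, string fallbackName)
    {
        Encoding fallback = Encoding.GetEncoding(fallbackName);

        // Third argument: detectEncodingFromByteOrderMarks.
        using (var reader = new StreamReader(path, fallback, true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

A call such as BomSniffer.ReadWithFallback(path, "ISO-8859-1") still depends on the fallback being correct for files without a BOM, which is the common case for legacy Western European text.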

    Converting ä to &auml; is not necessary if the page is encoded
    correctly.

    Cheers,
     
    Joerg Jooss, May 7, 2005
    #3
  4. Colin Peters

    Colin Peters Guest

    Unfortunately I'm using shared hosting. I have little influence over
    operating system parameters.

    Thanks anyway.
     
    Colin Peters, May 7, 2005
    #4
  5. Colin Peters

    Colin Peters Guest


    I didn't see ISO-8859-1 or Windows-1252 as an option provided by
    Intellisense for the class System.Text.Encoding.

    Thanks anyway.
     
    Colin Peters, May 7, 2005
    #5
  6. You can set the encoding as a Page directive.

    <%@Page Language="VB" ResponseEncoding="UTF-8"%>

    <%@Page Language="C#" ResponseEncoding="ISO-8859-1"%>
     
    Juan T. Llibre, May 7, 2005
    #6
  7. Joerg Jooss

    Joerg Jooss Guest

    Encoding exposes only a few encodings as static properties. You can
    obtain any other encoding by name using Encoding.GetEncoding(), e.g.

    Encoding enc = Encoding.GetEncoding("ISO-8859-1");
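
To show why the choice of name matters, here is a minimal sketch that decodes the same bytes under two encodings. The class and method names are hypothetical, and the sample bytes are assumed to be the word "Käse" saved as ISO-8859-1.

```csharp
using System;
using System.Text;

// Hypothetical helper for comparing decodings of the same bytes.
public static class DecodeDemo
{
    // The word "Käse" as ISO-8859-1 bytes; ä is the single byte 0xE4.
    public static readonly byte[] Sample = { 0x4B, 0xE4, 0x73, 0x65 };

    // Looks the encoding up by name and decodes the bytes with it.
    public static string Decode(byte[] data, string encodingName)
    {
        return Encoding.GetEncoding(encodingName).GetString(data);
    }
}
```

Decode(DecodeDemo.Sample, "ISO-8859-1") returns the word with its umlaut intact. Decoding the same bytes as "utf-8" cannot interpret the lone 0xE4 byte and substitutes the replacement character U+FFFD, which is one way characters end up "lost" when the wrong encoding is used.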

    Cheers,
     
    Joerg Jooss, May 7, 2005
    #7
  8. Colin Peters

    Colin Peters Guest

    Aha! The penny has dropped. Or in this case, the Euro.

    Many thanks to all.
     
    Colin Peters, May 7, 2005
    #8
  9. Guest

    Guest Guest

    Server.HtmlEncode(string) will convert any "special chars" from a text file
    to the relevant &abc; equivalent without having to worry about codepages... I
    use it in my chat application to prevent malicious code being inserted into
    the database.
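
Server.HtmlEncode needs a running ASP.NET request, so the sketch below swaps in System.Net.WebUtility.HtmlEncode (available from .NET Framework 4 onwards) to show the same idea outside a page. The class and method names are made up for illustration. As far as I know, both encoders emit named entities only for characters such as < > & and ", while ä and other characters in the 160 to 255 range come out as numeric references like &#228; rather than &auml;. Browsers treat the two forms the same.

```csharp
using System;
using System.Net;

// Hypothetical wrapper around the framework's HTML encoder.
public static class EntityDemo
{
    // Escapes markup characters and non-ASCII Latin-1 characters so the
    // text can be written into an HTML page as literal content.
    public static string ToHtml(string raw)
    {
        return WebUtility.HtmlEncode(raw);
    }

    // Reverses ToHtml, turning entities back into characters.
    public static string FromHtml(string encoded)
    {
        return WebUtility.HtmlDecode(encoded);
    }
}
```

Inside a page, the output of EntityDemo.ToHtml(line) could be passed to Response.Write. Encoding on output like this is also what stops user-supplied text from being interpreted as markup.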

    Regards,

    Paul Parkinson (www.elysaria.com)
     
    Guest, May 9, 2005
    #9