Regular expression to identify HTMLEncoded string

Discussion in 'ASP General' started by Gabriela, Nov 3, 2008.

  1. Gabriela

    Gabriela Guest

    Hi,
    I need help with writing a regexp that identifies HTML encoded
    strings.
    The problem occurred because I have a field in the DB, that contains
    regular ASCII chars, as well as HTMLencoded strings (e.g.:
    &#1494;&#1488;&#1514; &#1500;&#1488;).
    Is there a quick way to determine which strings are HTML encoded?
    Thanks,
    Gabi.
    Gabriela, Nov 3, 2008
    #1

  2. Evertjan.

    Evertjan. Guest

    Gabriela wrote on 03 nov 2008 in microsoft.public.inetserver.asp.general:

    > Hi,
    > I need help with writing a regexp that identifies HTML encoded
    > strings.
    > The problem occurred because I have a field in the DB, that contains
    > regular ASCII chars, as well as HTMLencoded strings (e.g.:
    > &#1494;&#1488;&#1514; &#1500;&#1488;).


    These all look to me like regular ASCII chars,
    as there are no irregular ASCII chars.

    > Is there a quick way to determine which strings are HTML encoded?


    var bolResult = /&#\d+;/.test(str); // true if str contains e.g. "&#1494;"

    perhaps?

    By the way, a JavaScript string is Unicode and can contain non-ASCII chars.
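
    A slightly broader check, offered as a sketch rather than as anyone's
    posted solution: numeric character references can be decimal (&#1494;)
    or hexadecimal (&#x5D6;), so a pattern limited to four decimal digits
    would miss some of them. The name hasNumericReference is made up for
    illustration.

```javascript
// Hypothetical helper: true if str contains at least one numeric
// character reference, decimal (&#1494;) or hexadecimal (&#x5D6;).
function hasNumericReference(str) {
  return /&#(?:\d+|[xX][0-9a-fA-F]+);/.test(str);
}
```

    It only reports that a reference is present somewhere in the string; it
    cannot tell whether the rest of the string was encoded as well.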

    --
    Evertjan.
    The Netherlands.
    (Please change the x'es to dots in my emailaddress)
    Evertjan., Nov 3, 2008
    #2

  3. "Gabriela" <> wrote in message
    news:...
    > Hi,
    > I need help with writing a regexp that identifies HTML encoded
    > strings.
    > The problem occurred because I have a field in the DB, that contains
    > regular ASCII chars, as well as HTMLencoded strings (e.g.:
    > &#1494;&#1488;&#1514; &#1500;&#1488;).
    > Is there a quick way to determine which strings are HTML encoded?


    Are you sure they're not all HTML encoded? (That is, are there any that
    contain characters that would normally be encoded but have not been?)
    Do you know how they came to have this encoding?
    Are there any HTML-specific entities such as &nbsp;, or are they all from
    the simple XML set?
    What is the DB field's data type?
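
    The question about HTML-specific entities matters because XML predefines
    only five named entities (amp, lt, gt, quot, apos), and an XML parser
    working without a DTD will generally reject any other name, &nbsp;
    included. A rough JavaScript sketch for listing such names in a value
    follows; findHtmlOnlyEntities and XML_ENTITIES are hypothetical names
    chosen for illustration.

```javascript
// The five entities predefined by XML. Any other named entity is
// treated here as HTML-specific.
var XML_ENTITIES = { amp: true, lt: true, gt: true, quot: true, apos: true };

// Hypothetical helper: returns the names of all named entities in str
// that are not part of the XML predefined set, in order of appearance.
function findHtmlOnlyEntities(str) {
  var found = [];
  var re = /&([A-Za-z][A-Za-z0-9]*);/g;
  var m;
  while ((m = re.exec(str)) !== null) {
    if (!XML_ENTITIES.hasOwnProperty(m[1])) {
      found.push(m[1]);
    }
  }
  return found;
}
```

    An empty result suggests the value uses only the XML set and numeric
    references. The pattern is a heuristic: text such as "AT&T;" would be
    reported even though it is not an intended entity.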

    Why do you want to detect them? Is it because you want to convert the
    strings back?

    If there are no HTML-specific entities and it's true that there are no
    values where characters that would normally be encoded aren't, then:

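    ' Wrap the value in a root element and let the XML parser decode it.
    Dim oXML : Set oXML = CreateObject("MSXML2.DOMDocument.3.0")
    If oXML.LoadXML("<root>" & sFieldValue & "</root>") Then
        sDecoded = oXML.documentElement.text
    Else
        sDecoded = sFieldValue ' not well-formed XML, so leave the value as is
    End If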
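
    For comparison, the same kind of decoding can be sketched in plain
    JavaScript without an XML parser. This is an illustration under stated
    assumptions, not a description of what MSXML does: decodeReferences is a
    hypothetical name, it handles only numeric references and the five XML
    entities, and it leaves anything else (such as &nbsp;) untouched.

```javascript
// Hypothetical helper: decodes decimal and hexadecimal character
// references plus the five XML predefined entities in a single pass,
// so "&amp;#1494;" becomes "&#1494;" and is not decoded a second time.
function decodeReferences(str) {
  var named = { amp: "&", lt: "<", gt: ">", quot: '"', apos: "'" };
  var re = /&(?:#(\d+)|#[xX]([0-9a-fA-F]+)|([A-Za-z]+));/g;
  return str.replace(re, function (match, dec, hex, name) {
    if (name) {
      // Unknown names (for example nbsp) are left exactly as found.
      return named.hasOwnProperty(name) ? named[name] : match;
    }
    var cp = dec ? parseInt(dec, 10) : parseInt(hex, 16);
    if (cp > 0x10FFFF) return match; // beyond the Unicode range
    if (cp <= 0xFFFF) return String.fromCharCode(cp);
    // Code points above the BMP need a UTF-16 surrogate pair.
    cp -= 0x10000;
    return String.fromCharCode(0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF));
  });
}
```

    Strings that contain no references come back unchanged, so under these
    assumptions it could be applied to every value in a mixed column.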

    --
    Anthony Jones - MVP ASP/ASP.NET
    Anthony Jones, Nov 4, 2008
    #3
