decode_entities possible bug?

Discussion in 'Perl' started by Richard Bell, May 29, 2004.

  1. Richard Bell

    Richard Bell Guest

    decode_entities does not appear to decode this text

    <span class="linksep1">&#8226;</span>

    The sequence &#8226 is left untouched.

    Is this correct/expected behavior, a bug, or what?

    Thanks for any help.

    R
     
    Richard Bell, May 29, 2004
    #1

  2. Richard Bell

    Richard Bell Guest

    Bob Walton wrote:
    > Richard Bell wrote:
    >
    >> decode_entities does not appear to decode this text
    >>
    >> <span class="linksep1">&#8226;</span>
    >>
    >> The sequence &#8226 is left untouched.
    >>
    >> Is this correct/expected behavior, a bug, or what?
    >>
    >> Thanks for any help.
    >>
    >> R

    >
    >
    > Perhaps you could clarify what "decode_entities" is? Is it some sub or
    > module you wrote, or part of a CPAN module? If the latter, which one of
    > the 6000+ modules is it a method of? Thanks.
    >


    Apologies, I should have been clearer. HTML::Entities.

    R
     
    Richard Bell, May 31, 2004
    #2
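
For readers who want to reproduce the report, here is a minimal sketch using the decode_entities function exported by HTML::Entities. It assumes a reasonably recent release of the module from CPAN (it ships in the HTML-Parser distribution), where numeric references above 255 are expected to decode; the variable names are illustrative only. The script prints code points rather than the characters themselves, so the result does not depend on the terminal's encoding.

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities);

# The markup from the original post, with the numeric reference written in full.
my $html = '<span class="linksep1">&#8226;</span>';

# In non-void context decode_entities returns a decoded copy
# and leaves $html itself unchanged.
my $decoded = decode_entities($html);

# Report the code point of every non-ASCII character in the result.
for my $char ($decoded =~ /([^\x00-\x7F])/g) {
    printf "decoded U+%04X\n", ord $char;
}

# Flag the behavior described in the thread, should it occur.
print "reference left untouched\n" if $decoded =~ /&#8226/;
```

If the script reports that the reference was left untouched, the installed module version is a likely suspect and worth checking before anything else.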
  5. Richard Bell

    Richard Bell Guest

    Bill wrote:
    > Bob Walton wrote:
    >
    >> Richard Bell wrote:
    >>
    >>> decode_entities does not appear to decode this text
    >>>
    >>> <span class="linksep1">&#8226;</span>
    >>>
    >>> The sequence &#8226 is left untouched.
    >>>
    >>> Is this correct/expected behavior, a bug, or what?
    >>>
    >>> Thanks for any help.
    >>>
    >>> R

    >>
    >>
    >>
    >> Perhaps you could clarify what "decode_entities" is? Is it some sub
    >> or module you wrote, or part of a CPAN module? If the latter, which
    >> one of the 6000+ modules is it a method of? Thanks.
    >>

    > He's using HTML::Entities to decode unicode for a bullet, and it does
    > not seem to work well. Perl support for Unicode over &#256; is still in
    > the works for some modules.
    >


    Thanks Bill. Is there another more appropriate choice?

    R
     
    Richard Bell, May 31, 2004
    #5
  6. Richard Bell

    Bill Guest

    >>>
    >> He's using HTML::Entities to decode unicode for a bullet, and it does
    >> not seem to work well. Perl support for Unicode over &#256; is still
    >> in the works for some modules.
    >>

    >
    > Thanks Bill. Is there another more appropriate choice?


    Why should you have to decode this anyway? On my system, even decoded,
    it will not display correctly outside of the browser. Why not leave it
    as is?
     
    Bill, May 31, 2004
    #6
  7. Richard Bell

    Richard Bell Guest

    Bill wrote:
    >>>>
    >>> He's using HTML::Entities to decode unicode for a bullet, and it does
    >>> not seem to work well. Perl support for Unicode over &#256; is still
    >>> in the works for some modules.
    >>>

    >>
    >> Thanks Bill. Is there another more appropriate choice?

    >
    >
    > Why should you have to decode this anyway? On my system, even decoded,
    > it will not display correctly outside of the browser. Why not leave it
    > as is?
    >

    Without going into overmuch detail, for my purposes (not display, but
    rather analysis of content) the undecoded characters get royally in the
    way. I assume something along the lines of s/&#[0-9]{1,4};?/ /g will
    turn them all into ' ' but had hoped for something a bit better, as there
    are some useful semantics amongst the rubble.

    R
     
    Richard Bell, May 31, 2004
    #7
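
As a rough illustration of the two options mentioned in the last post, the sketch below uses only core Perl. Both subroutine names are hypothetical and not part of HTML::Entities. The first converts decimal numeric references into the characters they name, which keeps the semantics; the second blanks them out, as the substitution in the post would. Hexadecimal references (&#x2022;) and named entities (&bull;) are deliberately not handled here.

```perl
use strict;
use warnings;

# Hypothetical helper: turn decimal references such as &#8226; into the
# character they name. The trailing semicolon is treated as optional.
sub decode_numeric_refs {
    my ($text) = @_;
    # Values above U+10FFFF are not code points, so they are left as written.
    $text =~ s{(&#([0-9]{1,7});?)}{ $2 <= 0x10FFFF ? chr($2) : $1 }ge;
    return $text;
}

# Hypothetical helper: the blunt alternative, replacing each decimal
# reference with a single space.
sub strip_numeric_refs {
    my ($text) = @_;
    $text =~ s/&#[0-9]+;?/ /g;
    return $text;
}

my $sample = 'News&#8226;Sports';

# Print code points instead of raw characters to stay encoding-neutral.
printf "U+%04X\n", ord $_
    for decode_numeric_refs($sample) =~ /([^\x00-\x7F])/g;

print strip_numeric_refs($sample), "\n";
```

Decoding rather than stripping keeps separators such as the bullet available as distinct characters for later analysis; whether that matters depends on the content being processed.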