Newbie: Simple problem with htm file.

Discussion in 'Perl Misc' started by John Smith, Jul 5, 2004.

  1. John Smith

    John Smith Guest

    Hello Perl gurus.

    I am having a problem reading a file that is a log file created by another
    program that is in html format.

    I have been using the open command to read and write to some text files and
    things are fine. When I open and read these htm log files Perl seems to be
    adding an extra space after each character. For example:
    < H e a d e r >
    T a p e L o g

    When I open the htm file with Notepad the html code looks fine. What am I
    missing here? Perl seems to know this file is different from a standard
    text file and is adding all these spaces on its own.

    Thanks for any assistance you can provide.
     
    John Smith, Jul 5, 2004
    #1

  2. In article <0P6Gc.27284$P7.12816@pd7tw3no>,
    John Smith <> wrote:
    :I am having a problem reading a file that is a log file created by another
    :program that is in html format.

    :I have been using the open command to read and write to some text files and
    :things are fine. When I open and read these htm log files Perl seems to be
    :adding an extra space after each character. For example:
    :< H e a d e r >
    :T a p e L o g

    Just a guess -- but I suspect the html file is utf8. See perldoc utf8
    --
    csh is bad drugs.
     
    Walter Roberson, Jul 5, 2004
    #2

  3. Joe Smith

    Joe Smith Guest

    Walter Roberson wrote:
    > In article <0P6Gc.27284$P7.12816@pd7tw3no>,
    > John Smith <> wrote:
    > :I am having a problem reading a file that is a log file created by another
    > :program that is in html format.
    >
    > :I have been using the open command to read and write to some text files and
    > :things are fine. When I open and read these htm log files Perl seems to be
    > :adding an extra space after each character. For example:
    > :< H e a d e r >
    > :T a p e L o g
    >
    > Just a guess -- but I suspect the html file is utf8. See perldoc utf8


    If there is a null byte between every printing character, then it is
    utf16, not utf8 (and not ASCII and not ISO-8859-1).
    -Joe
     
    Joe Smith, Jul 5, 2004
    #3
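
One way to check the guess above is to look at the first few bytes of the file before any decoding. The following is only a sketch: the sample file is a hypothetical stand-in for the real log, written as UTF-16LE with a byte order mark (the layout Windows tools often call "Unicode"), and `guess_bom` is an illustrative helper name, not part of any module.

```perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempfile);

# Hypothetical stand-in for the log file: "<Header>" saved as UTF-16LE,
# preceded by a byte order mark (U+FEFF).
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} encode('UTF-16LE', "\x{FEFF}<Header>\n");
close $tmp or die "close failed: $!";

# Read the first 8 bytes with no decoding and show them in hex.
open my $in, '<:raw', $path or die "Cannot open $path: $!";
read $in, my $head, 8;
close $in;

my $hex = join ' ', map { sprintf '%02X', ord($_) } split //, $head;
print "$hex\n";    # FF FE 3C 00 48 00 65 00

# Illustrative helper: name the encoding suggested by a leading BOM.
# (A UTF-32LE BOM also starts with FF FE; this sketch ignores that case.)
sub guess_bom {
    my ($bytes) = @_;
    return 'UTF-8'    if $bytes =~ /\A\xEF\xBB\xBF/;
    return 'UTF-16LE' if $bytes =~ /\A\xFF\xFE/;
    return 'UTF-16BE' if $bytes =~ /\A\xFE\xFF/;
    return 'no BOM';
}

print guess_bom($head), "\n";    # UTF-16LE
```

A 00 byte after every printable character, as in the hex dump, is what shows up as the "extra space" when the raw bytes are printed.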
  4. John Smith <> wrote:


    > I am having a problem reading a file


    [snip ...]

    > that is in html format.



    > Thanks for any assistance you can provide.



    Have you seen the Perl FAQs about processing HTML?

    perldoc -q HTML


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Jul 5, 2004
    #4
  5. Walter Roberson wrote:
    > In article <0P6Gc.27284$P7.12816@pd7tw3no>,
    > John Smith <> wrote:
    >> I have been using the open command to read and write to some text
    >> files and things are fine. When I open and read these htm log files
    >> Perl seems to be adding an extra space after each character. For
    >> example:

    >> < H e a d e r >
    >> T a p e L o g

    >
    > Just a guess -- but I suspect the html file is utf8. See perldoc utf8


    For English characters there is no binary difference whatsoever between
    UTF-8, ASCII, ISO-Latin-1, Windows-1252, etc, etc. That's one reason why
    programmers from English speaking countries typically are so ignorant about
    code pages. They just don't care because they don't need to care. Even if
    they write the code for ASCII and they receive UTF-8 data, it will still
    work for their English only test data.

    The symptoms described by the OP point more towards UTF-16.

    jue
     
    Jürgen Exner, Jul 5, 2004
    #5
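
If the file does turn out to be UTF-16, a minimal sketch of reading it through an encoding layer might look like the following. The sample file is again a hypothetical stand-in for the real log, and the sketch assumes the file starts with a byte order mark, which the `UTF-16` encoding needs in order to pick the byte order.

```perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempfile);

# Hypothetical stand-in for the log file: two lines saved as UTF-16LE
# with a leading byte order mark.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} encode('UTF-16LE', "\x{FEFF}<Header>\nTape Log\n");
close $tmp or die "close failed: $!";

# Undecoded read: every character except the BOM carries a NUL byte.
open my $raw, '<:raw', $path or die "Cannot open $path: $!";
my $bytes = do { local $/; <$raw> };
close $raw;
my $nuls = ($bytes =~ tr/\0//);
print length($bytes), " bytes, $nuls of them NUL\n";    # 38 bytes, 18 of them NUL

# Decoded read: the layer consumes the BOM and yields ordinary characters.
open my $in, '<:raw:encoding(UTF-16)', $path or die "Cannot open $path: $!";
chomp(my @lines = <$in>);
close $in;
print "$_\n" for @lines;    # <Header> then Tape Log

# For ASCII-only text, UTF-8 and ISO-8859-1 produce identical bytes.
my $text = 'Tape Log';
print "same bytes\n" if encode('UTF-8', $text) eq encode('iso-8859-1', $text);
```

The sample uses bare "\n" line endings. A file written on Windows with CRLF endings would leave a trailing "\r" on each line unless a `:crlf` layer is added after the encoding layer.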