Keeping whitespace in responseText, etc.

Discussion in 'Javascript' started by e271828, Aug 24, 2006.

  1. e271828

    e271828 Guest

    I'm trying to access the source of an HTML page with as few alterations
    from the actual source (as in, what the View Source option shows) as I
    can. The property document.documentElement.innerHTML returns the HTML
    source, but adds HEAD and other elements if they are absent from the
    source, and takes out whitespace (i.e., line feeds, carriage returns
    and tabs) within tags and between tags. The following function:

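    function loadSource() {
      // Declare xhr locally. The original assigned to a global "xhr" inside
      // a function also named xhr, which overwrote the function itself.
      var xhr = new XMLHttpRequest();
      xhr.open("GET", "test-page.html", true);
      xhr.onreadystatechange = function () {
        // readyState 4 means the response has been fully received.
        if (xhr.readyState == 4) {
          alert(xhr.responseText);
        }
      };
      xhr.send(null);
    }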

    doesn't add or alter any tags that are absent in the source, and does
    not take out line feeds within tags; it does, however, still take out
    all non-line-feed whitespace within tags and all whitespace in general
    between tags.

    It seems that preserving whitespace is all that I need, but I haven't
    found a way to do that through my searches. So is there any way to get
    the unaltered HTML source of a page without innerHTML or applets, like
    a better version of the XMLHttpRequest object's responseText property?

    Thanks,
    Eric
    e271828, Aug 24, 2006
    #1

  2. e271828 wrote:


    > alert(xhr.responseText);



    > doesn't add or alter any tags that are absent in the source, and does
    > not take out line feeds within tags; it does, however, still take out
    > all non-line-feed whitespace within tags and all whitespace in general
    > between tags.


    responseText gives you the text as the browser decodes it from the HTTP
    response body. responseText may decode characters incorrectly, depending
    on the encoding of the response, but I don't think the white space
    stripping you describe above actually occurs.

    I suspect rather that you are using Mozilla or Firefox and that the
    white space issue you notice is simply the somewhat broken alert dialog
    in Mozilla, where runs of white space are collapsed and not rendered.
    For example, if you do
    alert(['Line 1', ' Line 2', 'Line 3'].join('\r\n'))
    in Mozilla, the alert dialog will not show the white space at the
    beginning of Line 2 at all.
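
    One way to check this without relying on the alert dialog is to turn
    the whitespace characters into visible markers before displaying the
    text. The following is only a sketch; the function name showWhitespace
    and the choice of markers are assumptions for illustration, not part
    of any library:

```javascript
// Hypothetical helper: returns a copy of the text in which whitespace
// characters are replaced by visible markers.
function showWhitespace(text) {
  return text
    .replace(/\\/g, '\\\\')    // escape existing backslashes first
    .replace(/\t/g, '\\t')     // tab becomes the two characters \t
    .replace(/\r/g, '\\r')     // carriage return becomes \r
    .replace(/\n/g, '\\n')     // line feed becomes \n
    .replace(/ /g, '\u00b7');  // space becomes a middle dot
}
```

    Calling alert(showWhitespace(xhr.responseText)) should then show every
    tab, line break and space in the response, even in a dialog that
    collapses real whitespace.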

    Are you using Mozilla? If so, I think the issue lies in alerting the
    responseText, not in white space missing from responseText.

    If not, in which browser do you think white space gets lost when using
    responseText?

    --

    Martin Honnen
    http://JavaScript.FAQTs.com/
    Martin Honnen, Aug 24, 2006
    #2

  3. e271828

    e271828 Guest

    You were right, more or less. Unlike innerHTML, responseText doesn't
    alter the HTML it gets; but when shown in an alert box it can seem like
    responseText mangles whitespace. If you call responseText.split("\t")
    or .split("\n") and alert each piece in a for loop, however, you will
    see that the number of pieces minus 1 equals the number of tabs or new
    lines in your actual source (unlike innerHTML).
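
    The same check can be done without any alerts by counting the pieces
    directly. This is a minimal sketch; countOccurrences is a made-up name
    used only for illustration:

```javascript
// Hypothetical helper: counts how often a separator (e.g. "\t" or "\n")
// occurs in a string. Splitting yields one more piece than there are
// separators, hence the "- 1".
function countOccurrences(text, separator) {
  return text.split(separator).length - 1;
}
```

    For example, countOccurrences(xhr.responseText, "\t") gives the number
    of tabs in the response, which can be compared against the source file.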


    --


    But now I've encountered another problem I haven't been able to find an
    answer to: how do I get external URLs (say, http://www.google.com) to
    open in the XMLHttpRequest, instead of just local files?


    e271828, Aug 24, 2006
    #3
  4. e271828 wrote:


    > But now I've encountered another problem I haven't been able to find an
    > answer to: how do I get external URLs (say, http://www.google.com) to
    > open in the XMLHttpRequest, instead of just local files?


    Inside the normal browser context the same origin policy applies to
    the requests XMLHttpRequest makes, so you can only successfully make
    requests back to the server that your document with the script comes
    from. With client-side script you would therefore need your own
    server-side script to function as a proxy that fetches URLs from e.g.
    www.google.com.
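
    As a rough illustration of such a proxy, here is a sketch written for
    Node.js, which is only one of many possible server-side environments.
    The /proxy path, the url query parameter, the ALLOWED_HOSTS list and
    the function names are all assumptions made for this example. It does
    not follow redirects and forwards only the Content-Type header:

```javascript
// Hypothetical same-origin proxy sketch for Node.js.
const http = require('http');
const https = require('https');

// Only hosts listed here may be fetched, so the proxy is not an open relay.
const ALLOWED_HOSTS = ['www.google.com'];

// Expects request paths like /proxy?url=http%3A%2F%2Fwww.google.com%2F
// Returns a URL object for an allowed target, or null otherwise.
function parseTarget(requestUrl) {
  const parsed = new URL(requestUrl, 'http://localhost');
  const target = parsed.searchParams.get('url');
  if (parsed.pathname !== '/proxy' || !target) return null;
  let targetUrl;
  try {
    targetUrl = new URL(target);
  } catch (e) {
    return null; // not a valid absolute URL
  }
  if (targetUrl.protocol !== 'http:' && targetUrl.protocol !== 'https:') return null;
  if (ALLOWED_HOSTS.indexOf(targetUrl.hostname) === -1) return null;
  return targetUrl;
}

// Creates the server but does not start listening.
function createProxyServer() {
  return http.createServer(function (req, res) {
    const targetUrl = parseTarget(req.url);
    if (!targetUrl) {
      res.writeHead(400, { 'Content-Type': 'text/plain' });
      res.end('Bad or disallowed target URL');
      return;
    }
    const client = targetUrl.protocol === 'https:' ? https : http;
    client.get(targetUrl, function (upstream) {
      // Pass the upstream status and body through unchanged.
      res.writeHead(upstream.statusCode, {
        'Content-Type': upstream.headers['content-type'] || 'text/plain'
      });
      upstream.pipe(res);
    }).on('error', function () {
      if (!res.headersSent) {
        res.writeHead(502, { 'Content-Type': 'text/plain' });
      }
      res.end('Upstream request failed');
    });
  });
}
```

    If the server that delivers your page runs createProxyServer() (for
    example with .listen(8080)), the page could call
    xhr.open("GET", "/proxy?url=" + encodeURIComponent("http://www.google.com/"), true)
    and the request would stay within the same origin.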

    Outside of the browser sandbox (e.g. on Windows in an HTA (HTML
    application), with a Windows Script Host script, in an ASP page, or in
    the Mozilla browser if you write an extension) those restrictions do
    not apply.

    IE also has a zone model with different security zones which can be
    configured separately; for the trusted zone, for instance, you could
    change the settings to allow requests to other domains.


    --

    Martin Honnen
    http://JavaScript.FAQTs.com/
    Martin Honnen, Aug 25, 2006
    #4