Bugs: Content-Length not updated by reused urllib.request.Request / has_header() case-sensitive

Discussion in 'Python' started by Johannes Kleese, Nov 12, 2012.

  1. Hi!

    (Yes, I did take a look at the issue tracker but couldn't find any
    corresponding bug, and no, I don't want to open a new account just for
    this one.)

    --------------------------------------------------------------------

    I'm reusing a single urllib.request.Request object to HTTP-POST data to
    the same URL a number of times. While the data itself is sent as
    expected every time, the Content-Length header is not updated after the
    first request. Tested with Python 3.1.3 and Python 3.1.4.
    request = urllib.request.Request("http://example.com/", headers =
    {"Content-Type": "application/x-www-form-urlencoded"})

    [('Content-length', '1'), ('Content-type',
    'application/x-www-form-urlencoded'), ('Host', 'example.com'),
    ('User-agent', 'Python-urllib/3.1')]

    [('Content-length', '1'), ('Content-type',
    'application/x-www-form-urlencoded'), ('Host', 'example.com'),
    ('User-agent', 'Python-urllib/3.1')]

    Note that after the second run, Content-Length stays "1", but should be
    "9", corresponding to the data b'123456789'. (Request data is not
    x-www-form-urlencoded to shorten the test case. Doesn't affect the bug,
    though.)
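A possible workaround on affected versions, sketched under the assumption that the HTTP handler only adds Content-length when the Request does not already carry one: drop the stale header before reusing the object. The helper name clear_stale_content_length is made up for this illustration, and the add_unredirected_header() call merely simulates what the first POST leaves behind, so no network access is needed.

```python
import urllib.request


def clear_stale_content_length(request):
    """Drop any cached Content-length so the next open() can recompute it.

    Hypothetical helper, not part of urllib.request.
    """
    # Request keeps its headers in two plain dicts, keyed in capitalized form.
    request.headers.pop("Content-length", None)
    request.unredirected_hdrs.pop("Content-length", None)
    return request


request = urllib.request.Request(
    "http://example.com/",
    headers={"Content-Type": "application/x-www-form-urlencoded"})

# Simulate the header left behind by a first POST of b"1".
request.add_unredirected_header("Content-length", "1")

# Before the second POST, remove the stale value.
clear_stale_content_length(request)
print(request.has_header("Content-length"))
```

Building a fresh Request for every POST avoids the problem as well and does not touch the header dicts directly.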




    --------------------------------------------------------------------

    While at it, I noticed that urllib.request.Request.has_header() and
    .get_header() are case-sensitive, while HTTP header names are not
    (RFC 2616, section 4.2). Thus the following, slightly unfortunate behaviour:
    [('Content-length', '1'), ('Content-type',
    'application/x-www-form-urlencoded'), ('Host', 'example.com'),
    ('User-agent', 'Python-urllib/3.1')]
    'application/x-www-form-urlencoded'
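The mismatch can be reproduced without any network traffic. The sketch below assumes that Request stores header names via str.capitalize() (so "Content-Type" becomes "Content-type") and wraps the lookups accordingly. The names has_header_ci and get_header_ci are hypothetical helpers for this illustration, not part of urllib.request.

```python
import urllib.request


def has_header_ci(request, name):
    """Case-insensitive has_header(): normalize the name before looking it up."""
    return request.has_header(name.capitalize())


def get_header_ci(request, name, default=None):
    """Case-insensitive get_header(): normalize the name before looking it up."""
    return request.get_header(name.capitalize(), default)


request = urllib.request.Request(
    "http://example.com/",
    headers={"Content-Type": "application/x-www-form-urlencoded"})

# The spelling passed to the constructor is not found by the plain lookup...
print(request.has_header("Content-Type"))
# ...while the capitalized form and the wrapped lookups are.
print(request.has_header("Content-type"))
print(has_header_ci(request, "CONTENT-TYPE"))
print(get_header_ci(request, "content-type"))
```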
     
    Johannes Kleese, Nov 12, 2012
    #1

  2. Johannes Kleese

    Terry Reedy Guest

    You have to open a tracker account only once. I am reluctant to
    report this myself, as I do not use the module and cannot answer
    questions about it. 3.1 only gets security fixes, so consider upgrading.
    In any case, suspected bugs need to be tested with the latest release,
    as patches get applied daily. As it happens,

    import urllib.request

    opener = urllib.request.build_opener()
    request = urllib.request.Request(
        "http://example.com/",
        headers={"Content-Type": "application/x-www-form-urlencoded"})

    # First POST: Content-length is computed as 1.
    opener.open(request, "1".encode("us-ascii"))
    print(request.data, '\n', request.header_items())

    # Second POST on the same Request: Content-length should now be 9.
    opener.open(request, "123456789".encode("us-ascii"))
    print(request.data, '\n', request.header_items())

    exhibits the same behavior in 3.3.0 of printing ('Content-length', '1')
    in the last output. I agree that that looks wrong, but I do not know if
    such re-use is supposed to be supported.

    Python is case sensitive.
    Judging from 'Content-type', 'User-agent', 'Content-length', 'Host',
    urllib.request consistently capitalizes the first word of all header
    tags and expects them in that form. If that is not standard, it should
    be documented.
     
    Terry Reedy, Nov 12, 2012
    #2

  3. Johannes Kleese

    Terry Reedy Guest

    I opened http://bugs.python.org/issue16464
     
    Terry Reedy, Nov 13, 2012
    #3
  4. Stuck with Debian on a server, thus stuck with 3.1 on development machine.
    The Request object should then either get it right on re-use (which I'd
    prefer), or block re-use.
    True, of course, but

    and the functions work on HTTP data, not Python data. After all, we are
    lucky to have functions here and not just a dictionary.


    Anyway, thanks for reporting!
     
    Johannes Kleese, Nov 13, 2012
    #4
  5. Johannes Kleese

    Terry Reedy Guest

    A patch has been written by Alexey Kachayev and pushed by Andrew Svetlov
    and the behavior will change in 3.4.0 to allow reuse.
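A quick offline check of the changed behaviour, as a sketch rather than a description of the actual patch: it assumes Python 3.4 or later, where assigning new data to a Request is expected to discard a previously computed Content-length. The add_unredirected_header() call stands in for the header that the HTTP handler adds during the first open().

```python
import urllib.request

request = urllib.request.Request("http://example.com/", data=b"1")

# Stand-in for the header the HTTP handler adds during the first open().
request.add_unredirected_header("Content-length", "1")

# opener.open(request, new_data) assigns request.data; do the same directly.
request.data = b"123456789"

# On 3.4+ the stale header should be gone, so the next open() recomputes it.
stale_header_present = request.has_header("Content-length")
print(stale_header_present)
```

On 3.3 and earlier the same script would be expected to report the stale header as still present, which matches the output shown earlier in the thread.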
     
    Terry Reedy, Nov 27, 2012
    #5
