R. David Murray
mattia said: Hi all, can you tell me why the module urllib.request (py3) add extra
characters (b'fef\r\n and \r\n0\r\n\r\n') in a simple example like the
following and urllib2 (py2.6) correctly not?
py2.6
... print(f, file=fd)
...
Opening the two html pages with ff I've got different results (the extra
characters mentioned earlier), why?
The problem isn't a difference between urllib2 and urllib.request, it
is between fd.write and print. This produces the same result as
your first example:
.... fd.write(f)
The "b'....'" is the stringified representation of a bytes object,
which is what reading a urllib.request response returns in python3.
Note the 'wb', which is a critical difference from the python2.6 case.
If you omit the 'b' in python3, it will complain that you can't write
bytes to the file object.
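
A minimal sketch of that complaint, assuming a made-up bytes literal in place of the data read from the URL (the names data and path are only for illustration). Writing bytes to a file opened in text mode raises TypeError, while the same write succeeds once the file is opened with 'wb':

```python
import os
import tempfile

# Hypothetical stand-in for the bytes read from a urllib.request response.
data = b"<html></html>"

path = os.path.join(tempfile.mkdtemp(), "page.html")

# Text mode ('w'): the file object only accepts str, so bytes are rejected.
error = None
try:
    with open(path, "w") as fd:
        fd.write(data)
except TypeError as exc:
    error = exc

print(type(error).__name__, error)

# Binary mode ('wb'): the bytes are written unchanged.
with open(path, "wb") as fd:
    count = fd.write(data)

print(count, "bytes written")
```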
The thing to keep in mind is that print converts its argument to string
before writing it anywhere (that's the point of using it), and that
bytes (or buffer) and string are very different types in python3.
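
To see the difference without touching the network, here is a small sketch that uses a made-up bytes literal as a stand-in for the downloaded page (the file names and content are assumptions for illustration only). It writes the same bytes once with print and once with fd.write, then reads both files back:

```python
import os
import tempfile

# Hypothetical stand-in for the bytes read from a urllib.request response.
page = b"<html>\r\n<body>hi</body>\r\n</html>"

tmpdir = tempfile.mkdtemp()
printed_path = os.path.join(tmpdir, "printed.html")
written_path = os.path.join(tmpdir, "written.html")

# print() converts the bytes object to str first, so the file receives
# its representation: the b'...' wrapper and escaped \r\n sequences.
with open(printed_path, "w") as fd:
    print(page, file=fd)

# write() on a file opened with 'wb' stores the bytes as they are.
with open(written_path, "wb") as fd:
    fd.write(page)

with open(printed_path, "rb") as fd:
    printed = fd.read()
with open(written_path, "rb") as fd:
    written = fd.read()

print(printed)  # starts with b"b'<html>..."
print(written)  # identical to the original bytes
```

The first file contains the literal characters b, ', and backslash escapes, which is what a browser then displays; the second contains only the page itself.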