python+libxml2+scrapy AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'

Discussion in 'Python' started by Dmitry Arsentiev, Aug 15, 2012.

  1. Hello.

    Has anybody run into this problem before?
    AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'

    When I run scrapy, I get

    File "/usr/local/lib/python2.7/site-packages/scrapy/selector/factories.py",
    line 14, in <module>
    libxml2.HTML_PARSE_NOERROR + \
    AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'


    When I run
    python -c 'import libxml2; libxml2.HTML_PARSE_RECOVER'

    I get
    Traceback (most recent call last):
    File "<string>", line 1, in <module>
    AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'

    How can I cure it?

    Python 2.7
    libxml2-python 2.6.9
    2.6.11-gentoo-r6


    I will be grateful for any help.

    DETAILS:

    scrapy crawl lgz -o items.json -t json
    Traceback (most recent call last):
    File "/usr/local/bin/scrapy", line 4, in <module>
    execute()
    File "/usr/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 112, in execute
    cmds = _get_commands_dict(inproject)
    File "/usr/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 37, in _get_commands_dict
    cmds = _get_commands_from_module('scrapy.commands', inproject)
    File "/usr/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 30, in _get_commands_from_module
    for cmd in _iter_command_classes(module):
    File "/usr/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 21, in _iter_command_classes
    for module in walk_modules(module_name):
    File "/usr/local/lib/python2.7/site-packages/scrapy/utils/misc.py", line 65, in walk_modules
    submod = __import__(fullpath, {}, {}, [''])
    File "/usr/local/lib/python2.7/site-packages/scrapy/commands/shell.py", line 8, in <module>
    from scrapy.shell import Shell
    File "/usr/local/lib/python2.7/site-packages/scrapy/shell.py", line 14, in <module>
    from scrapy.selector import XPathSelector, XmlXPathSelector, HtmlXPathSelector
    File "/usr/local/lib/python2.7/site-packages/scrapy/selector/__init__.py", line 30, in <module>
    from scrapy.selector.libxml2sel import *
    File "/usr/local/lib/python2.7/site-packages/scrapy/selector/libxml2sel.py", line 12, in <module>
    from .factories import xmlDoc_from_html, xmlDoc_from_xml
    File "/usr/local/lib/python2.7/site-packages/scrapy/selector/factories.py", line 14, in <module>
    libxml2.HTML_PARSE_NOERROR + \
    AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
    Dmitry Arsentiev, Aug 15, 2012
    #1

  2. Re: python+libxml2+scrapy AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'

    Dmitry Arsentiev writes:

    > Has anybody already meet the problem like this? -
    > AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
    >
    > When I run scrapy, I get
    >
    > File "/usr/local/lib/python2.7/site-packages/scrapy/selector/factories.py",
    > line 14, in <module>
    > libxml2.HTML_PARSE_NOERROR + \
    > AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'


    Apparently, your versions of "scrapy" and "libxml2" do not match.

    Check which "libxml2" versions your "scrapy" version works with,
    and then install one of them.
    Dieter Maurer, Aug 16, 2012
    #2
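
The check suggested in the reply above can be sketched in a few lines: instead of waiting for the import to fail, list which parser option constants the installed binding lacks. This is only an illustration, written for Python 3. The helper name `missing_attributes` and the stand-in module are hypothetical, and the two constant names are the ones that appear in the traceback.

```python
import types


def missing_attributes(module, names):
    """Return the names from `names` that `module` does not define."""
    return [name for name in names if not hasattr(module, name)]


# Constant names taken from the traceback in this thread.
REQUIRED = ["HTML_PARSE_RECOVER", "HTML_PARSE_NOERROR"]

# Hypothetical stand-in for an old binding, so the sketch runs
# without libxml2 installed. The value is a placeholder.
old_binding = types.ModuleType("libxml2")
old_binding.HTML_PARSE_NOERROR = 32

print(missing_attributes(old_binding, REQUIRED))
```

With a real installation, pass the imported `libxml2` module in place of the stand-in; an empty list means none of the listed names are missing.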

  3. Dmitry Arsentiev

    Guest

    I believe ftp://xmlsoft.org/libxml2/libxml2-2.8.0.tar.gz is what you're looking for. Submit a ticket to get the docs updated if you're feeling generous.

    On Wednesday, August 15, 2012 7:49:04 AM UTC-5, Dmitry Arsentiev wrote:
    > Has anybody already meet the problem like this? -
    > AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
    >
    > When I run scrapy, I get
    >
    > File "/usr/local/lib/python2.7/site-packages/scrapy/selector/factories.py",
    > line 14, in <module>
    > libxml2.HTML_PARSE_NOERROR + \
    > AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
    >
    > [...]
    Guest, Aug 17, 2012
    #3
  4. Dmitry Arsentiev, 15.08.2012 14:49:
    > Has anybody already meet the problem like this? -
    > AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
    >
    > When I run scrapy, I get
    >
    > File "/usr/local/lib/python2.7/site-packages/scrapy/selector/factories.py",
    > line 14, in <module>
    > libxml2.HTML_PARSE_NOERROR + \
    > AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
    >
    >
    > When I run
    > python -c 'import libxml2; libxml2.HTML_PARSE_RECOVER'
    >
    > I get
    > Traceback (most recent call last):
    > File "<string>", line 1, in <module>
    > AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
    >
    > How can I cure it?
    >
    > Python 2.7
    > libxml2-python 2.6.9
    > 2.6.11-gentoo-r6


    That version of libxml2 is way too old and doesn't support parsing
    real-world HTML. IIRC, that started with 2.6.21 and got improved a bit
    after that.

    Get a 2.8.0 installation, as someone pointed out already.

    Stefan
    Stefan Behnel, Aug 18, 2012
    #4
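
To compare an installed version against the threshold recalled in the post above, the version strings need to be compared numerically, since plain string comparison would rank "2.6.9" above "2.6.21". A minimal sketch for Python 3, assuming purely numeric dotted versions; the helper name `parse_version` is hypothetical, and the 2.6.21 threshold is the "IIRC" figure from the post, not a verified one.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.6.9' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Threshold as recalled (not verified) in the post above.
MINIMUM = parse_version("2.6.21")

# The version reported in the question and the one recommended in the replies.
for installed in ("2.6.9", "2.8.0"):
    ok = parse_version(installed) >= MINIMUM
    print(installed, "new enough" if ok else "too old")
```

Tuples compare element by element, so (2, 6, 9) sorts below (2, 6, 21) as intended.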