python+libxml2+scrapy AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'

Dmitry Arsentiev

Hello.

Has anybody already met a problem like this?
AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'

When I run scrapy, I get

File "/usr/local/lib/python2.7/site-packages/scrapy/selector/factories.py",
line 14, in <module>
libxml2.HTML_PARSE_NOERROR + \
AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'


When I run
python -c 'import libxml2; libxml2.HTML_PARSE_RECOVER'

I get
Traceback (most recent call last):
File "<string>", line 1, in <module>
AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'

How can I fix it?

Python 2.7
libxml2-python 2.6.9
2.6.11-gentoo-r6


I will be grateful for any help.

DETAILS:

scrapy crawl lgz -o items.json -t json
Traceback (most recent call last):
File "/usr/local/bin/scrapy", line 4, in <module>
execute()
File "/usr/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 112, in execute
cmds = _get_commands_dict(inproject)
File "/usr/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 37, in _get_commands_dict
cmds = _get_commands_from_module('scrapy.commands', inproject)
File "/usr/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 30, in _get_commands_from_module
for cmd in _iter_command_classes(module):
File "/usr/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 21, in _iter_command_classes
for module in walk_modules(module_name):
File "/usr/local/lib/python2.7/site-packages/scrapy/utils/misc.py", line 65, in walk_modules
submod = __import__(fullpath, {}, {}, [''])
File "/usr/local/lib/python2.7/site-packages/scrapy/commands/shell.py", line 8, in <module>
from scrapy.shell import Shell
File "/usr/local/lib/python2.7/site-packages/scrapy/shell.py", line 14, in <module>
from scrapy.selector import XPathSelector, XmlXPathSelector, HtmlXPathSelector
File "/usr/local/lib/python2.7/site-packages/scrapy/selector/__init__.py", line 30, in <module>
from scrapy.selector.libxml2sel import *
File "/usr/local/lib/python2.7/site-packages/scrapy/selector/libxml2sel.py", line 12, in <module>
from .factories import xmlDoc_from_html, xmlDoc_from_xml
File "/usr/local/lib/python2.7/site-packages/scrapy/selector/factories.py", line 14, in <module>
libxml2.HTML_PARSE_NOERROR + \
AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
 
Dieter Maurer

Dmitry Arsentiev said:
Has anybody already met a problem like this?
AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'

When I run scrapy, I get

File "/usr/local/lib/python2.7/site-packages/scrapy/selector/factories.py",
line 14, in <module>
libxml2.HTML_PARSE_NOERROR + \
AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'

Apparently, your versions of "scrapy" and "libxml2" are incompatible.

Check which "libxml2" versions your "scrapy" version supports,
then install one of them.
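
One way to check for the mismatch before running scrapy is to test whether the installed binding exposes the constants that scrapy expects. The following is a minimal sketch, not part of scrapy or libxml2: the helper names `missing_attributes` and `check` are hypothetical, and the `REQUIRED` list holds only the two constants that appear in the traceback above, so the real list for a given scrapy version may be longer.

```python
import importlib

# Assumption: only the two constants visible in the traceback are listed.
# A given scrapy version may reference more libxml2 names than these.
REQUIRED = ["HTML_PARSE_RECOVER", "HTML_PARSE_NOERROR"]


def missing_attributes(module, names):
    """Return the names from `names` that `module` does not define."""
    return [name for name in names if not hasattr(module, name)]


def check(module_name="libxml2", names=REQUIRED):
    """Import `module_name` and report missing names.

    Returns None if the module cannot be imported at all,
    otherwise a (possibly empty) list of missing attribute names.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return missing_attributes(module, names)


if __name__ == "__main__":
    result = check()
    if result is None:
        print("libxml2 Python binding is not importable")
    elif result:
        print("missing constants: " + ", ".join(result))
    else:
        print("all listed constants are present")
```

A non-empty list means the installed binding is too old (or otherwise unsuitable) for the scrapy version in use.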
 
personificator

I believe ftp://xmlsoft.org/libxml2/libxml2-2.8.0.tar.gz is what you're looking for. Submit a ticket to get the docs updated if you're feeling generous.
 
Stefan Behnel

Dmitry Arsentiev, 15.08.2012 14:49:
Has anybody already met a problem like this?
AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'

When I run scrapy, I get

File "/usr/local/lib/python2.7/site-packages/scrapy/selector/factories.py",
line 14, in <module>
libxml2.HTML_PARSE_NOERROR + \
AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'


When I run
python -c 'import libxml2; libxml2.HTML_PARSE_RECOVER'

I get
Traceback (most recent call last):
File "<string>", line 1, in <module>
AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'

How can I cure it?

Python 2.7
libxml2-python 2.6.9
2.6.11-gentoo-r6

That version of libxml2 is way too old and doesn't support parsing
real-world HTML. IIRC, that started with 2.6.21 and got improved a bit
after that.

Get a 2.8.0 installation, as someone pointed out already.

Stefan
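
To compare an installed version against the minimum mentioned above, the version strings need to be compared numerically rather than as text, because "2.6.9" sorts after "2.6.21" as a plain string. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the 2.6.21 threshold is taken from the recollection in this post (flagged there with "IIRC"), and the version string is passed in by hand rather than read from libxml2 itself.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.6.9' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Assumption: threshold as recalled in the thread, not verified against
# the libxml2 changelog.
MINIMUM = parse_version("2.6.21")


def is_recent_enough(installed, minimum=MINIMUM):
    """Return True if `installed` is at least `minimum`, compared numerically."""
    return parse_version(installed) >= minimum


if __name__ == "__main__":
    for version in ("2.6.9", "2.8.0"):
        print(version + " recent enough: " + str(is_recent_enough(version)))
```

With the versions from this thread, 2.6.9 falls below the threshold and 2.8.0 passes it.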
 
