How do I access 'Beautiful Soup' on Python 2.7 or 3.4, console or IDLE versions?

Discussion in 'Python' started by Simon Evans, May 10, 2014.

  1. Simon Evans

    Simon Evans Guest

    I am new to Python, but my main interest is to use it for web scraping. I have downloaded Beautiful Soup and followed the instructions in the book 'Getting Started with Beautiful Soup', but my Python installations keep returning errors, so I can't get started. I have unzipped Beautiful Soup to a folder of the same name on my C drive, in accordance with the first two steps on page 12 of the aforementioned publication, but I get stuck at step three, which reads: "Open up the command line prompt and navigate to the folder where you have unzipped the folder as follows:
    cd Beautiful Soup
    python setup python install "

    On my Python 2.7 this returns:
    File "<stdin>", line 1
    cd Beautiful Soup
    SyntaxError: invalid syntax
    also I get:
    to my IDLE Python 2.7 version, same goes for the Python 3.4 installations.
    Hope someone can help.
    Thanks in advance.
    Simon Evans, May 10, 2014
  2. This would be the operating system command line, not Python's
    interactive mode. Since you refer to a C drive, I'm going to assume
    Windows; you'll want to open up "Command Prompt", or cmd.exe, or
    whatever name your version of Windows buries it under. (Microsoft does
    not make it particularly easy on you.) Since you have a space in the
    name, you'll need quotes:

    cd "c:\Beautiful Soup"

    Then proceed as per the instructions.

    Chris Angelico, May 10, 2014
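
The `SyntaxError` in the first post arises because `cd` is an operating-system command, not Python. As a rough sketch (the folder below is a temporary stand-in created for the demonstration, not the real `c:\Beautiful Soup`), this shows Python rejecting the line and the closest in-Python equivalent, `os.chdir`:

```python
import os
import tempfile

# A shell command typed at the Python prompt does not parse as Python.
try:
    compile("cd Beautiful Soup", "<stdin>", "single")
    parsed_as_python = True
except SyntaxError:
    parsed_as_python = False

# Hypothetical stand-in for the unzipped folder, created in a temp directory.
folder = os.path.join(tempfile.mkdtemp(), "Beautiful Soup")
os.makedirs(folder)

previous = os.getcwd()
os.chdir(folder)      # the path is a single Python string, so the space needs no extra quoting
current = os.getcwd()
os.chdir(previous)    # restore the original working directory

print(parsed_as_python)
print(os.path.basename(current))
```

This does not replace the advice above: the book's installation step is still meant for the Windows command prompt.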
  3. Terry Reedy

    Terry Reedy Guest

    On the All Programs / Start menu, look under Accessories. I have it
    pinned to my Win 7 task bar.
    Not for Win 7, at least

    C:\Users\Terry>cd \program files

    C:\Program Files>
    Terry Reedy, May 10, 2014
  4. Huh, good to know.

    Unfortunately, Windows leaves command-line parsing completely up to
    the individual command/application, so some will need quotes, some
    won't, and some will actually do very different things if you put an
    argument in quotes (look at START and FIND). There is a broad
    convention that spaces in file names get protected with quotes, though
    (for instance, tab completion will put quotes around them), so it's
    not complete chaos.

    Chris Angelico, May 11, 2014
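
To see the "quotes protect spaces" convention in action, here is a small sketch using `subprocess.list2cmdline`, an undocumented helper in Python's standard library that joins arguments following the Microsoft C runtime quoting rules. As noted above, `cd` is a cmd.exe built-in with its own parsing, so treat this as an illustration of the convention rather than of what `cd` does internally:

```python
import subprocess

# Each list item is one logical word, even when it contains a space.
args = ["cd", r"c:\Beautiful Soup"]

# list2cmdline adds quotes only around the words that need them.
command_line = subprocess.list2cmdline(args)
print(command_line)

# A naive whitespace split, by contrast, breaks the folder name in two.
naive_words = r"cd c:\Beautiful Soup".split()
print(naive_words)
```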
  5. Dave Angel

    Dave Angel Guest

    "Complete chaos" is a pretty good description, especially since MS
    decided to make the default directory paths for many things have
    embedded spaces in them. And to change the rules from version to
    version of the OS. And it's not just the cmd line that's inconsistent;
    some exec function variants liberally parse unquoted names looking for
    some file that happens to match the first few "words" of the string.

    I once debugged a customer problem (without actually seeing the
    machine), and told tech support to ask him if he had a file in the root
    directory called "program.exe." I turned out to be right, and the
    customer was sure I must have hacked into his machine.

    There was a bug in our code (missing quotes), masked by the liberality
    of the function I mentioned, that wasn't visible till such a file existed.

    The customer symptom? Our code complained that the linker couldn't be
    Dave Angel, May 11, 2014
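
The "program.exe" story can be illustrated with a deliberately simplified model. The function below is hypothetical and is not the actual Windows algorithm; it only shows why an unquoted path with spaces is ambiguous when a launcher tries successively longer prefixes as the program name. The example path is invented for the illustration:

```python
def candidate_executables(unquoted_command):
    """Return the guesses a lenient launcher might try, shortest first."""
    words = unquoted_command.split(" ")
    candidates = []
    for count in range(1, len(words) + 1):
        guess = " ".join(words[:count])
        # Assume the launcher appends ".exe" when the guess lacks it.
        if not guess.lower().endswith(".exe"):
            guess += ".exe"
        candidates.append(guess)
    return candidates

# Hypothetical install path, chosen only for illustration.
guesses = candidate_executables(r"C:\Program Files\Tools\link.exe")
for guess in guesses:
    print(guess)
```

Under this model the first guess is `C:\Program.exe`, so a stray file of that name in the root directory would be picked up before the intended program.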
  6. Simon Evans

    Simon Evans Guest

    Thank you to everyone who replied for your help. At the command prompt console, the first line is accepted, but the second line is not. I have altered it a little, but it still fails. I quote my console input and output here, as it can probably explain things better than I can:

    Microsoft Windows [Version 6.1.7601]
    Copyright (c) 2009 Microsoft Corporation. All rights reserved.

    C:\Users\Intel Atom>cd"c:\Beautiful Soup"
    The filename, directory name, or volume label syntax is incorrect.

    C:\Users\Intel Atom>cd "c:\Beautiful Soup"

    c:\Beautiful Soup>python install.
    File "", line 22
    print "Unit tests have failed!"
    SyntaxError: invalid syntax

    c:\Beautiful Soup>python install"
    File "", line 22
    print "Unit tests have failed!"
    SyntaxError: invalid syntax

    c:\Beautiful Soup>
    I have tried writing "python install", i.e. putting the statement in inverted commas, but the console still seems to reject it:
    c:\Beautiful Soup>"python setup. py install"
    '"python setup. py install"' is not recognized as an internal or external command,
    operable program or batch file.

    c:\Beautiful Soup>
    Simon Evans, May 11, 2014
  7. Thank you. This sort of transcript does make it very easy to see
    what's going on!
    Command line syntax is always to put a command first (one word), and
    then its arguments (zero, one, or more words). You put quotes around a
    logical word when it has spaces in it. So, for instance, "foo bar" is
    one logical word. In this case, you omitted the space between the
    command and its argument, so Windows couldn't handle it. [1]
    And this is correct; you put quotes around the argument, and execute
    the "cd" command with an argument of "c:\Beautiful Soup". It then
    works, as is shown by the change of prompt in your subsequent lines.
    This indicates that you've installed a 3.x version of Python as the
    default, and the script you are running is expecting a 2.x Python. Do
    you have multiple Pythons installed? Try typing this:

    c:\Python27\python install

    (That will work only if you have Python 2.7 installed into the default

    Hope that helps!

    [1] Windows lets you be a bit sloppy; for instance, cd\ works without
    a space between the command and the argument. (AFAIK this is true if
    and only if the path name starts with a backslash.) But normally, you
    separate command and argument(s) with a space.
    Chris Angelico, May 11, 2014
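
The line the transcript complains about, `print "Unit tests have failed!"`, is a Python 2 print statement. A minimal sketch for checking which form the current interpreter accepts (the helper name `compiles_here` is made up for this example):

```python
import sys

def compiles_here(source):
    """Return True if this interpreter can parse the given source text."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

python2_style = 'print "Unit tests have failed!"'    # statement form, Python 2 only
portable_style = 'print("Unit tests have failed!")'  # parses on both 2.x and 3.x

print(sys.version_info[0], compiles_here(python2_style), compiles_here(portable_style))
```

On a 3.x interpreter the first form fails to compile, which matches the `SyntaxError` in the transcript.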
  8. Simon Evans

    Simon Evans Guest

    Dear Chris Angelico,
    Yes, you are right, I did install Python 3.4 as well as 2.7. I have removed Python 3.4 and entered the command you suggested, and it looks like it has installed properly, returning the following output:
    Microsoft Windows [Version 6.1.7601]
    Copyright (c) 2009 Microsoft Corporation. All rights reserved.

    C:\Users\Intel Atom>cd "c:\Beautiful Soup"

    c:\Beautiful Soup>c:\Python27\python install
    running install
    running build
    running build_py
    creating build
    creating build\lib
    copying -> build\lib
    copying -> build\lib
    running install_lib
    copying build\lib\ -> c:\Python27\Lib\site-packages
    copying build\lib\ -> c:\Python27\Lib\site-packages
    byte-compiling c:\Python27\Lib\site-packages\ to BeautifulSoup.p
    byte-compiling c:\Python27\Lib\site-packages\ to BeautifulS
    running install_egg_info
    Writing c:\Python27\Lib\site-packages\BeautifulSoup-3.2.1-py2.7.egg-info

    c:\Beautiful Soup>
    Simon Evans, May 11, 2014
  9. MRAB

    MRAB Guest

    You didn't need to remove Python 3.4.

    When you typed:

    python install

    it defaulted to Python 3.4, presumably because that was the last one
    you installed.

    You just needed to be explicit instead:

    C:\Python27\python.exe install
    MRAB, May 11, 2014
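
When several Pythons are installed, it can help to ask the interpreter itself which one is running. A short sketch using only the standard `sys` module; the printed path will differ from machine to machine:

```python
import sys

# Full path of the interpreter executing this code, e.g. a python.exe
# somewhere under the install directory.
interpreter_path = sys.executable

# Major and minor version as text, such as "2.7" or "3.4".
version_text = "%d.%d" % sys.version_info[:2]

print("Running Python %s from %s" % (version_text, interpreter_path))
```

Running this once with plain `python` and once with the explicit path shows which installation each command actually starts.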
  10. Terry Reedy

    Terry Reedy Guest

    There is no need for a standalone Beautiful Soup directory. See below.
    Please do not advise people to unnecessarily downgrade to 2.7 ;-).
    Simon just needs the proper current version of BeautifulSoup.
    BeautifulSoup3 does not work with 3.x.
    BeautifulSoup4 works with 2.6+ and 3.x.
    Installation (of the latest version on PyPI) is trivial with 3.4:

    C:\Programs\Python34>pip install beautifulsoup4
    Downloading/unpacking beautifulsoup4
    egg_info for package

    Installing collected packages: beautifulsoup4
    Running install for beautifulsoup4
    Skipping implicit fixer: buffer
    Skipping implicit fixer: idioms
    Skipping implicit fixer: set_literal
    Skipping implicit fixer: ws_comma

    Successfully installed beautifulsoup4
    Cleaning up...

    Adding the '4' is necessary as
    package and that fails with the SyntaxError message Simon got.

    With '4', there is now an entry in lib/site-packages and you are ready to go.

    Python 3.4.0 (v3.4.0:04f714765c13, Mar 16 2014, 19:25:23) [MSC v.1600 64
    bit (AMD64)] on win32
    <class 'bs4.BeautifulSoup'>
    Terry Reedy, May 11, 2014
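
A quick way to confirm which Beautiful Soup, if any, a given interpreter can see is to look for its import name. Beautiful Soup 4 is imported as `bs4`, while Beautiful Soup 3 used the module name `BeautifulSoup`. The sketch below assumes Python 3.4 or later (for `importlib.util.find_spec`), and the helper name `is_installed` is made up for this example:

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module of this name can be found."""
    return importlib.util.find_spec(module_name) is not None

# "bs4" is Beautiful Soup 4; "BeautifulSoup" is the old version 3 module.
for name in ("bs4", "BeautifulSoup"):
    print(name, is_installed(name))
```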
  11. Simon Evans

    Simon Evans Guest

    I have downloaded Beautiful Soup 3, and I am using Python 2.7. I understand from your message that I ought to use Python 2.6 or Python 3.4 with Beautiful Soup 4. The book I am using, 'Getting Started with Beautiful Soup', is for Beautiful Soup 4. Therefore I gather I must re-download Beautiful Soup to get version 4, dispose of my Python 2.7, and reinstall Python 3.4. I am sure I can do this, but doesn't the above information suggest that the only Python version left that might work with Beautiful Soup 3 would be Python 2.7? That is the configuration I have at present, though I am not perfectly happy with it, as it is not accepting code from the book (meant for BS4), such as the following on page 16:

    helloworld = "<p>Hello World</p>"

    Simon Evans, May 11, 2014
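
For reference, here is a hedged sketch of how a line like the one quoted above is typically used with Beautiful Soup 4. It is not taken from the book; the variable names `soup_object` and `paragraph_text` are chosen for this example, and the code falls back gracefully if `bs4` is not installed for the interpreter that runs it:

```python
helloworld = "<p>Hello World</p>"

try:
    from bs4 import BeautifulSoup   # import name used by Beautiful Soup 4
except ImportError:
    BeautifulSoup = None            # bs4 is not available to this interpreter

if BeautifulSoup is None:
    paragraph_text = None
    print("bs4 is not installed for this interpreter")
else:
    # "html.parser" is the parser bundled with Python itself.
    soup_object = BeautifulSoup(helloworld, "html.parser")
    paragraph_text = soup_object.p.string
    print(paragraph_text)
```

Code like this belongs in a .py file or at the Python `>>>` prompt, not at the Windows `C:\>` prompt.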
  12. MRAB

    MRAB Guest

    That's the Windows command prompt, not the Python command prompt.
    MRAB, May 11, 2014
  13. Simon Evans

    Simon Evans Guest

    Yeah, well, at no point does the book say to start entering the code mentioned at the Python command prompt rather than the Windows command prompt, but thank you for your guidance anyway.
    I have downloaded the latest version of Beautiful Soup 4, but am again facing problems with the second line of code, re:-
    Microsoft Windows [Version 6.1.7601]
    Copyright (c) 2009 Microsoft Corporation. All rights reserved.

    C:\Users\Intel Atom>cd "c:\Beautiful Soup"

    c:\Beautiful Soup>c:\Python27\python install
    c:\Python27\python: can't open file '': [Errno 2] No such file or direct
    Simon Evans, May 12, 2014
  14. Simon Evans

    Simon Evans Guest

    Oh I think I see - I should be using Python 3.4 now, with BS4 ?
    Simon Evans, May 12, 2014
  15. Simon Evans

    Simon Evans Guest

    - but wait a moment: 'BeautifulSoup4 works with 2.6+ and 3.x' (Terry Reedy). Doesn't 2.6+ include 2.7, which is what I'm using with BeautifulSoup4?
    Simon Evans, May 12, 2014
  16. Ian Kelly

    Ian Kelly Guest

    The error message is telling you that the file that you're
    trying to run is missing. That would seem to indicate that Beautiful
    Soup hasn't been downloaded or unzipped correctly. What do you have
    in the Beautiful Soup directory?

    Also, use Python 3.4 as Terry Reedy suggested, unless the book is
    using 2.7 in which case you should probably use the same version as
    the book.
    Ian Kelly, May 12, 2014
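
One way to answer "what do you have in the directory?" is to list it and check for the install script. This is only a sketch: the default file name `setup.py` is an assumption based on the `python setup. py install` line in the earlier transcript, the helper `describe_folder` is made up for the example, and a temporary folder stands in for the real one:

```python
import os
import tempfile

def describe_folder(folder, expected="setup.py"):
    """Return the sorted directory entries and whether `expected` is among them."""
    entries = sorted(os.listdir(folder))
    return entries, expected in entries

# Stand-in folder containing a single unrelated file.
demo_folder = tempfile.mkdtemp()
open(os.path.join(demo_folder, "README.txt"), "w").close()

entries, has_script = describe_folder(demo_folder)
print(entries)
print(has_script)
```

If the script is missing from the top level, the archive may have unzipped into a nested subfolder, which would be consistent with the "No such file" error.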
  17. Terry Reedy

    Terry Reedy Guest

    I wrote "BeautifulSoup4 works with 2.6+ and 3.x.".
    '2.6+' means 2.6 or 2.7. '3.x' should mean 3.1 to 3.4 but the range
    might start later. It does not matter because you should download and
    use 3.4 unless you *really* need to use something earlier. But also note
    that Windows has no problem with multiple versions of Python installed in
    different pythonxy directories.

    One of the things 3.4 does for you is make sure that pip is installed.
    It is now the more or less 'official' python package installer. To
    install BS4, do what the authors recommend on their web page
    and what I did: 'pip install beautifulsoup4' in a python34 directory. It
    took me less than a minute, far less than it took you to report that
    doing something else did not work.
    Terry Reedy, May 12, 2014
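
With more than one Python installed, running pip as a module of a specific interpreter (`python -m pip`) makes it unambiguous which installation receives the package. The sketch below only builds and prints the command; the helper name `pip_install_command` is made up, and actually running the command is left as a commented-out line:

```python
import sys

def pip_install_command(package, interpreter=sys.executable):
    """Build the argument list for installing `package` with a given interpreter."""
    return [interpreter, "-m", "pip", "install", package]

command = pip_install_command("beautifulsoup4")
print(" ".join(command))

# To actually perform the install, one could run:
# import subprocess; subprocess.check_call(command)
```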
  18. Ian Kelly

    Ian Kelly Guest

    Following up on that, if this is the book you are using:

    then it says to use Python 2.7.5 or greater. There is no indication
    that the book is targeted at Python 3, and in fact I see at least one
    line that won't work in Python 3 ("import urllib2"), so I definitely
    recommend sticking with a 2.7 release.
    Ian Kelly, May 12, 2014
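
For readers who want book-style code to run on either version, a common pattern is to try the Python 2 import first and fall back to the Python 3 location. In Python 3, `urlopen` lives in `urllib.request`. This is a general compatibility sketch, not something the book itself is known to do:

```python
try:
    from urllib2 import urlopen          # Python 2 location
except ImportError:
    from urllib.request import urlopen   # Python 3 location

# Either way, the same name is now available to the rest of the script.
print(urlopen.__module__)
```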
  19. Oh, I'm glad of that! But without digging into the details of BS, all
    I could say for sure was that was expecting 2.x. :)

    Sticking with 3.4 and upgrading to BS4 is a much better solution.

    Chris Angelico, May 12, 2014
  20. Rustom Mody

    Rustom Mody Guest

    I guess you've moved on from this specific problem.
    However here is some general advice:

    To use beautiful soup you need to use python.
    To use python you need to know python.
    Some people spend months on that, or weeks or days.

    Maybe you are clever and can reduce that to hours but not further :)

    So start with this

    [depending on which python you need]

    It may take a bit longer; but you will suffer less.
    Rustom Mody, May 12, 2014
