The method of insert doesn't work with nltk texts: AttributeError: 'ConcatenatedCorpusView' object has no attribute 'insert'

Discussion in 'Python' started by Token Type, Sep 2, 2012.

  1. Token Type

    Token Type Guest

    I wrote some code to add 'like' after every third word in an nltk text, as follows:

    >>> text = nltk.corpus.brown.words(categories = 'news')
    >>> def hedge(text):
            for i in range(3,len(text),4):
                new_text = text.insert(i, 'like')
            return new_text[:50]

    >>> hedge(text)

    Traceback (most recent call last):
      File "<pyshell#77>", line 1, in <module>
        hedge(text)
      File "<pyshell#76>", line 3, in hedge
        new_text = text.insert(i, 'like')
    AttributeError: 'ConcatenatedCorpusView' object has no attribute 'insert'

    Isn't the text from the Brown corpus above a list? Why doesn't it have the attribute 'insert'?

    Thanks much for your hints.
     
    Token Type, Sep 2, 2012
    #1

  2. Token Type

    Dave Angel Guest

    On 09/02/2012 05:39 AM, Token Type wrote:
    > I wrote codes to add 'like' at the end of every 3 word in a nltk text as follows:
    >
    > >>> text = nltk.corpus.brown.words(categories = 'news')
    > >>> def hedge(text):
    >         for i in range(3,len(text),4):
    >             new_text = text.insert(i, 'like')
    >         return new_text[:50]
    >
    > >>> hedge(text)
    >
    > Traceback (most recent call last):
    >   File "<pyshell#77>", line 1, in <module>
    >     hedge(text)
    >   File "<pyshell#76>", line 3, in hedge
    >     new_text = text.insert(i, 'like')
    > AttributeError: 'ConcatenatedCorpusView' object has no attribute 'insert'
    >
    > Isn't text in the brown corpus above a list? why doesn't it has attribute 'insert'?
    >

    I tried to find online documentation for nltk, and although I found the
    mention of a free online book, I didn't see it. So, some generic comments.

    The error message is telling you that the object 'text' is not a list,
    but a "ConcatenatedCorpusView". Perhaps you can look that up in your
    docs for nltk. But there's quite a bit you can do just with the
    interpreter.

    Try print type(text) to see the type of text.

    Try dir(text) to see what attributes it has.

    Try help(text) to see what docstrings might be built in.
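
    The same checks can be tried without NLTK installed. What follows is only a sketch: a plain tuple stands in for the corpus view (it is not the NLTK class, just another object that reads like a list but has no insert()), and the syntax is Python 3 rather than the Python 2 used in this thread.

```python
# A tuple stands in for the corpus view: list-like to read, but no insert().
words = ("The", "Fulton", "County", "Grand")

type_name = type(words).__name__        # the real type, whatever the repr looks like
can_insert = hasattr(words, "insert")   # is there an insert attribute at all?
public_names = [name for name in dir(words) if not name.startswith("_")]

print(type_name, can_insert, public_names)

# Copying into a real list gives an object that does have insert().
word_list = list(words)
word_list.insert(3, "like")
print(word_list)
```

    Checking hasattr() before calling a method is usually enough to tell a real list from something that merely prints like one.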

    Incidentally, if you really think it's a list of words (or that it acts
    like a list), then 'text' might not be the best name for it. Any reason
    you didn't just call it words ?

    --

    DaveA
     
    Dave Angel, Sep 2, 2012
    #2

  3. Token Type

    Dave Angel Guest

    Re: Fwd: The method of insert doesn't work with nltk texts:

    On 09/02/2012 09:06 AM, John H. Li wrote:
    > First, thanks very much for your kind help.
    >
    > 1) Furthermore, I tested insert() on a plain list. It worked as follows:
    >
    > >>> text = ['The', 'Fulton', 'County', 'Grand']
    > >>> text.insert(3,'like')
    > >>> text
    > ['The', 'Fulton', 'County', 'like', 'Grand']
    >
    > 2) I tested the text from nltk. It is actually a list. See the following:
    >
    > >>> text = nltk.corpus.brown.words(categories = 'news')
    > >>> text[:10]
    > ['The', 'Fulton', 'County', 'Grand', 'Jury', 'said', 'Friday', 'an',
    > 'investigation', 'of']
    >
    > How come Python tells me that it is not a list by raising "AttributeError:
    > 'ConcatenatedCorpusView' object has no attribute 'insert'"? I am confused.
    >
    > Since we suspect text is not a list, I added one more line of code
    > as follows. Then it seems to work.
    >
    > >>> text = nltk.corpus.brown.words(categories = 'news')
    > >>> def hedge(text):
    >         text = list(text)
    >         for i in range(3,len(text),4):
    >             text.insert(i, 'like')
    >         return text[:50]
    >
    > >>> hedge(text)
    >

    > ['The', 'Fulton', 'County', 'like', 'Grand', 'Jury', 'said', 'like',
    > 'Friday', 'an', 'investigation', 'like', 'of', "Atlanta's", 'recent',
    > 'like', 'primary', 'election', 'produced', 'like', '``', 'no', 'evidence',
    > 'like', "''", 'that', 'any', 'like', 'irregularities', 'took', 'place',
    > 'like', '.', 'The', 'jury', 'like', 'further', 'said', 'in', 'like',
    > 'term-end', 'presentments', 'that', 'like', 'the', 'City', 'Executive',
    > 'like', 'Committee', ',']
    >
    > Isn't it odd?
    >
    >


    Without reading the documentation, or at least the help(), I can't
    say that it's odd. If a class wants to support indexing and slicing, all
    it has to do is implement special methods like __getitem__ (or, in
    older Python 2 code, __getslice__ and __setslice__). If it doesn't
    document .insert(), then you shouldn't try to call it. That's duck typing.
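
    To make that concrete, here is a minimal sketch of a class that supports len(), indexing and slicing without being a list. WordView is a hypothetical name invented for this illustration; it is not how NLTK implements ConcatenatedCorpusView. Python 3 syntax is assumed.

```python
class WordView:
    """Hypothetical read-only view over some words (not NLTK's class)."""

    def __init__(self, words):
        self._words = list(words)

    def __len__(self):
        return len(self._words)

    def __getitem__(self, index):
        # index is an int for view[0] and a slice object for view[:3];
        # both are simply delegated to the inner list.
        return self._words[index]


view = WordView(["The", "Fulton", "County", "Grand", "Jury"])

first_three = view[:3]                  # slicing works through __getitem__
has_insert = hasattr(view, "insert")    # insert() was never defined on the class

# list() can still copy the view: with no __iter__, Python falls back to
# calling __getitem__ with 0, 1, 2, ... until IndexError is raised.
as_list = list(view)
as_list.insert(3, "like")
```

    The view behaves like a list for reading, which is why text[:10] worked in the session above, but only the copy made with list() can be modified with insert().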

    What did you get when you tried type(), dir() and help()? Did they help?

    --

    DaveA
     
    Dave Angel, Sep 2, 2012
    #3
  4. Token Type

    Peter Otten Guest

    Re: The method of insert doesn't work with nltk texts: AttributeError: 'ConcatenatedCorpusView' object has no attribute 'insert'

    Token Type wrote:

    > I wrote codes to add 'like' at the end of every 3 word in a nltk text as follows:
    >
    > >>> text = nltk.corpus.brown.words(categories = 'news')
    > >>> def hedge(text):
    >         for i in range(3,len(text),4):
    >             new_text = text.insert(i, 'like')
    >         return new_text[:50]
    >
    > >>> hedge(text)
    >
    > Traceback (most recent call last):
    >   File "<pyshell#77>", line 1, in <module>
    >     hedge(text)
    >   File "<pyshell#76>", line 3, in hedge
    >     new_text = text.insert(i, 'like')
    > AttributeError: 'ConcatenatedCorpusView' object has no attribute 'insert'
    >
    > Isn't text in the brown corpus above a list? why doesn't it has attribute 'insert'?
    >
    > Thanks much for your hints.


    The error message shows that text is not a list. It looks like a list,

    >>> text = nltk.corpus.brown.words(categories="news")
    >>> text

    ['The', 'Fulton', 'County', 'Grand', 'Jury', 'said', ...]

    but it is actually a nltk.corpus.reader.util.ConcatenatedCorpusView:

    >>> type(text)

    <class 'nltk.corpus.reader.util.ConcatenatedCorpusView'>

    The implementer of a class is free to decide what methods he wants to
    implement. You can get a first impression of the available ones with dir():

    >>> dir(text)

    ['_MAX_REPR_SIZE', '__add__', '__class__', '__cmp__', '__contains__',
    '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__',
    '__getitem__', '__hash__', '__init__', '__iter__', '__len__', '__module__',
    '__mul__', '__new__', '__radd__', '__reduce__', '__reduce_ex__', '__repr__',
    '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__',
    '__weakref__', '_offsets', '_open_piece', '_pieces', 'close', 'count',
    'index', 'iterate_from']

    As you can see insert() is not among these methods. However, __iter__() is a
    hint that you can convert the ConcatenatedCorpusView to a list, and that
    does provide an insert() method. Let's try:

    >>> text = list(text)
    >>> type(text)

    <type 'list'>
    >>> text.insert(0, "yadda")
    >>> text[:5]

    ['yadda', 'The', 'Fulton', 'County', 'Grand']
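
    One more detail matters for the original hedge(): list.insert() modifies the list in place and returns None, so assigning its result to new_text, as the first version did, leaves nothing to slice even once text is a real list. A short sketch in Python 3 syntax:

```python
words = ["The", "Fulton", "County", "Grand"]

# insert() changes `words` itself; the value it returns is None.
result = words.insert(3, "like")

print(result)   # None
print(words)    # ['The', 'Fulton', 'County', 'like', 'Grand']
```

    With that assignment kept, new_text[:50] would fail with a TypeError, because None cannot be sliced.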

    Note that your hedge() function may still not work as you expect:

    >>> text = ["-"] * 20
    >>> text

    ['-', '-', '-', '-', '-', '-', '-', '-', '-', '-', '-', '-', '-', '-', '-',
    '-', '-', '-', '-', '-']
    >>> for i in range(0, len(text), 3):
    ...     text.insert(i, "X")
    ...
    >>> text

    ['X', '-', '-', 'X', '-', '-', 'X', '-', '-', 'X', '-', '-', 'X', '-', '-',
    'X', '-', '-', 'X', '-', '-', '-', '-', '-', '-', '-', '-']

    That is because the list is growing with every insert() call. One workaround
    is to start inserting items at the end of the list:
    >>> text = ["-"] * 20
    >>> for i in reversed(range(0, len(text), 3)):
    ...     text.insert(i, "X")
    ...
    >>> text

    ['X', '-', '-', '-', 'X', '-', '-', '-', 'X', '-', '-', '-', 'X', '-', '-',
    '-', 'X', '-', '-', '-', 'X', '-', '-', '-', 'X', '-', '-']
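
    An alternative to inserting into a growing list is to build a new list in a single pass. The sketch below is one possible rewrite of hedge(), not code from the thread; the parameter names every and filler are made up for the illustration, and Python 3 syntax is assumed. It accepts any iterable of words, so a corpus view could be passed without converting it first.

```python
def hedge(words, every=3, filler="like"):
    """Return a new list with `filler` added after every `every` words."""
    hedged = []
    for position, word in enumerate(words, start=1):
        hedged.append(word)
        if position % every == 0:
            hedged.append(filler)
    return hedged


sample = ["The", "Fulton", "County", "Grand", "Jury", "said", "Friday"]
print(hedge(sample))
# ['The', 'Fulton', 'County', 'like', 'Grand', 'Jury', 'said', 'like', 'Friday']
```

    Because nothing is inserted into the sequence being read, the positions never shift. A loop such as range(3, len(text), 4) fixes its upper bound before any insertion happens, so the words that end up past that index in the longer list never receive a filler.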
     
    Peter Otten, Sep 2, 2012
    #4
