Trouble Counting Words, Sentences and Paragraphs

Discussion in 'Ruby' started by Max Norman, Jul 22, 2009.

  1. Max Norman

    Max Norman Guest

    I'm working on the first example application from 'Beginning Ruby: From
    Novice to Professional,' a text analyzer that counts the number of
    characters (with and without spaces), lines, words, sentences and
    paragraphs in a text document. Unfortunately, I've run into trouble: the
    numbers don't seem to be coming out right.

    Attached is the text file, for testing purposes, and below is the
    source:

    lines = File.readlines("text.txt")
    line_count = lines.size
    text = lines.join

    puts "#{line_count} lines."

    total_characters = text.length
    puts "#{total_characters} characters."

    total_characters_nonspaces = text.gsub(/\s+/, '').length
    puts "#{total_characters_nonspaces} characters, excluding spaces."

    word_count = text.split.length
    puts "#{word_count} words."

    paragraph_count = text.split(/\n\n/).length
    puts "#{paragraph_count} paragraphs."

    sentence_count = text.split(/\.|\? |!/).length
    puts "#{sentence_count} sentences."
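
    A side note on the sentence pattern above: /\.|\? |!/ only treats a
    question mark as a sentence end when a space follows it. The following
    is a minimal sketch of an alternative, not the book's method; the
    helper name count_sentences is hypothetical. It splits on runs of
    terminators and drops empty fragments. Abbreviations such as "Mrs."
    would still be counted as sentence ends.

```ruby
# Hypothetical helper (not from the original post): counts sentences by
# splitting on runs of '.', '?' or '!' and ignoring blank fragments.
def count_sentences(text)
  text.split(/[.?!]+/).reject { |fragment| fragment.strip.empty? }.length
end

sample = "Hello there. How are you? Fine!"
puts "#{count_sentences(sample)} sentences."
```

    Using a character class with + means that "..." or "?!" counts as a
    single sentence boundary instead of producing empty fragments.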

    --

    Here are the results I get:
    42 lines.
    6446 characters.
    5315 characters, excluding spaces.
    1130 words.
    2 paragraphs.
    44 sentences.

    Any and all help/advice would be appreciated.

    Attachments:
    http://www.ruby-forum.com/attachment/3890/text.txt

    --
    Posted via http://www.ruby-forum.com/.
    Max Norman, Jul 22, 2009
    #1

  2. 7stud --

    7stud -- Guest

    Max Norman wrote:
    > Unfortunately, I've run into trouble: the
    > numbers don't seem to be coming out right.
    >


    What are your suspicions?
    7stud --, Jul 22, 2009
    #2

  3. Max Norman

    Max Norman Guest

    7stud -- wrote:
    > Max Norman wrote:
    >> Unfortunately, I've run into trouble: the
    >> numbers don't seem to be coming out right.
    >>

    >
    > What are your suspicions?


    My concern stems from the paragraph count: the application reports only
    two paragraphs, but the document is segmented into a score more.
    Max Norman, Jul 22, 2009
    #3
  4. 7stud --

    7stud -- Guest

    Max Norman wrote:
    > 7stud -- wrote:
    >> Max Norman wrote:
    >>> Unfortunately, I've run into trouble: the
    >>> numbers don't seem to be coming out right.
    >>>

    >>
    >> What are your suspicions?

    >
    > My concern stems from the paragraph count: the application reports only
    > two paragraphs, but the document is segmented into a score more.


    The code counts paragraphs by splitting on two consecutive newlines,
    that is, on a blank line between paragraphs, which would look like this:


    hello world.
    other text.

    goodbye world.
    other text.
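
    As a quick check of that definition, here is a small sketch that runs
    the same split on the sample above. The method name count_paragraphs
    is only for illustration; the regular expression is the one from the
    original code.

```ruby
# Illustrative wrapper around the split used in the original code:
# each run of text between two consecutive newlines is one paragraph.
def count_paragraphs(text)
  text.split(/\n\n/).length
end

sample = "hello world.\nother text.\n\ngoodbye world.\nother text.\n"
puts "#{count_paragraphs(sample)} paragraphs."  # prints "2 paragraphs."
```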

    This is what your text file looks like to me:

    {\rtf1\ansi\ansicpg1252\cocoartf949\cocoasubrtf460
    {\fonttbl\f0\fnil\fcharset0 Verdana;}
    {\colortbl;\red255\green255\blue255;}
    \margl1440\margr1440\vieww9000\viewh8400\viewkind0 \deftab720
    \pard\pardeftab720\ql\qnatural \f0\fs24 \cf0 Among other public
    buildings in a certain town, which for many reasons it will be prudent
    to refrain from mentioning, and to which I will assign no fictitious
    name, there is one anciently common to most towns, great or small: to
    wit, a workhouse; and in this workhouse was born; on a day and date
    which I need not trouble myself to repeat, inasmuch as it can be of no
    possible consequence to the reader, in this stage of the business at all
    events; the item of mortality whose name is prefixed to the head of this
    chapter.\ \ For a long time after it was ushered into this world of
    sorrow and trouble, by the parish surgeon, it remained a matter of
    considerable doubt whether..
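
    The file shown above starts with the RTF signature "{\rtf", and its
    paragraph breaks appear to be marked with backslashes rather than
    blank lines, which would explain the low paragraph count. A minimal
    sketch of a guard for this case follows; the helper name rtf? is
    hypothetical and the check only looks at the first characters.

```ruby
# Hypothetical helper: true when the text begins with the RTF signature.
# Inside single quotes, '\r' is a literal backslash followed by 'r'.
def rtf?(text)
  text.start_with?('{\rtf')
end

sample = "{\\rtf1\\ansi\\ansicpg1252 Among other public buildings"
puts "This looks like RTF, not plain text." if rtf?(sample)
```

    Running a check like this before counting makes the failure visible
    instead of silently producing misleading numbers.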
    7stud --, Jul 22, 2009
    #4
  5. Max Norman

    Max Norman Guest

    This is what the text file should look like:
    Among other public buildings in a certain town, which for many reasons
    it will be prudent to refrain from mentioning, and to which I will
    assign no fictitious name, there is one anciently common to most towns,
    great or small: to wit, a workhouse; and in this workhouse was born; on
    a day and date which I need not trouble myself to repeat, inasmuch as it
    can be of no possible consequence to the reader, in this stage of the
    business at all events; the item of mortality whose name is prefixed to
    the head of this chapter.

    For a long time after it was ushered into this world of sorrow and
    trouble, by the parish surgeon, it remained a matter of considerable
    doubt whether the child would survive to bear any name at all; in which
    case it is somewhat more than probable that these memoirs would never
    have appeared; or, if they had, that being comprised within a couple of
    pages, they would have possessed the inestimable merit of being the most
    concise and faithful specimen of biography, extant in the literature of
    any age or country.

    Although I am not disposed to maintain that the being born in a
    workhouse, is in itself the most fortunate and enviable circumstance
    that can possibly befall a human being, I do mean to say that in this
    particular instance, it was the best thing for Oliver Twist that could
    by possibility have occurred. The fact is, that there was considerable
    difficulty in inducing Oliver to take upon himself the office of
    respiration,--a troublesome practice, but one which custom has rendered
    necessary to our easy existence; and for some time he lay gasping on a
    little flock mattress, rather unequally poised between this world and
    the next: the balance being decidedly in favour of the latter. Now, if,
    during this brief period, Oliver had been surrounded by careful
    grandmothers, anxious aunts, experienced nurses, and doctors of profound
    wisdom, he would most inevitably and indubitably have been killed in no
    time. There being nobody by, however, but a pauper old woman, who was
    rendered rather misty by an unwonted allowance of beer; and a parish
    surgeon who did such matters by contract; Oliver and Nature fought out
    the point between them. The result was, that, after a few struggles,
    Oliver breathed, sneezed, and proceeded to advertise to the inmates of
    the workhouse the fact of a new burden having been imposed upon the
    parish, by setting up as loud a cry as could reasonably have been
    expected from a male infant who had not been possessed of that very
    useful appendage, a voice, for a much longer space of time than three
    minutes and a quarter.

    As Oliver gave this first proof of the free and proper action of his
    lungs, the patchwork coverlet which was carelessly flung over the iron
    bedstead, rustled; the pale face of a young woman was raised feebly from
    the pillow; and a faint voice imperfectly articulated the words, 'Let me
    see the child, and die.'

    The surgeon had been sitting with his face turned towards the fire:
    giving the palms of his hands a warm and a rub alternately. As the young
    woman spoke, he rose, and advancing to the bed's head, said, with more
    kindness than might have been expected of him:

    'Oh, you must not talk about dying yet.'

    'Lor bless her dear heart, no!' interposed the nurse, hastily
    depositing in her pocket a green glass bottle, the contents of which she
    had been tasting in a corner with evident satisfaction.

    'Lor bless her dear heart, when she has lived as long as I have, sir,
    and had thirteen children of her own, and all on 'em dead except two,
    and them in the wurkus with me, she'll know better than to take on in
    that way, bless her dear heart! Think what it is to be a mother, there's
    a dear young lamb do.'

    Apparently this consolatory perspective of a mother's prospects failed
    in producing its due effect. The patient shook her head, and stretched
    out her hand towards the child.

    The surgeon deposited it in her arms. She imprinted her cold white lips
    passionately on its forehead; passed her hands over her face; gazed
    wildly round; shuddered; fell back--and died. They chafed her breast,
    hands, and temples; but the blood had stopped forever. They talked of
    hope and comfort. They had been strangers too long.

    'It's all over, Mrs. Thingummy!' said the surgeon at last.

    'Ah, poor dear, so it is!' said the nurse, picking up the cork of the
    green bottle, which had fallen out on the pillow, as she stooped to take
    up the child. 'Poor dear!'

    'You needn't mind sending up to me, if the child cries, nurse,' said the
    surgeon, putting on his gloves with great deliberation. 'It's very
    likely it WILL be troublesome. Give it a little gruel if it is.' He put
    on his hat, and, pausing by the bed-side on his way to the door, added,
    'She was a good-looking girl, too; where did she come from?'

    'She was brought here last night,' replied the old woman, 'by the
    overseer's order. She was found lying in the street. She had walked some
    distance, for her shoes were worn to pieces; but where she came from, or
    where she was going to, nobody knows.'

    The surgeon leaned over the body, and raised the left hand. 'The old
    story,' he said, shaking his head: 'no wedding-ring, I see. Ah!
    Good-night!'

    The medical gentleman walked away to dinner; and the nurse, having once
    more applied herself to the green bottle, sat down on a low chair before
    the fire, and proceeded to dress the infant.

    What an excellent example of the power of dress, young Oliver Twist was!
    Wrapped in the blanket which had hitherto formed his only covering, he
    might have been the child of a nobleman or a beggar; it would have been
    hard for the haughtiest stranger to have assigned him his proper station
    in society. But now that he was enveloped in the old calico robes which
    had grown yellow in the same service, he was badged and ticketed, and
    fell into his place at once--a parish child--the orphan of a
    workhouse--the humble, half-starved drudge--to be cuffed and buffeted
    through the world--despised by all, and pitied by none.

    Oliver cried lustily. If he could have known that he was an orphan, left
    to the tender mercies of church-wardens and overseers, perhaps he would
    have cried the louder.
    Max Norman, Jul 22, 2009
    #5
  6. 7stud --

    7stud -- Guest

    Try this:


    require 'pp'

    lines = File.readlines("text.txt")
    pp lines
    puts "----"

    text = lines.join

    paragraph_count = text.split(/\n\n/).length
    puts "#{paragraph_count} paragraphs."

    What do you see?
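
    If the pp output were to show line endings such as "\r\n", or blank
    lines that contain spaces, the /\n\n/ split would miss those
    paragraph breaks. The following is a sketch of a more tolerant count
    under that assumption; count_paragraphs_tolerant is a hypothetical
    name, not something from the book or the thread.

```ruby
# Hypothetical helper: counts paragraphs while tolerating Windows or old
# Mac line endings and "blank" lines that contain only spaces or tabs.
def count_paragraphs_tolerant(text)
  # Normalize \r\n and lone \r to \n.
  normalized = text.gsub(/\r\n?/, "\n")
  # A break is a newline followed by one or more whitespace-only lines.
  chunks = normalized.split(/\n(?:[ \t]*\n)+/)
  chunks.reject { |chunk| chunk.strip.empty? }.length
end

sample = "first paragraph\r\n\r\nsecond paragraph\r\n \r\nthird paragraph\r\n"
puts "#{count_paragraphs_tolerant(sample)} paragraphs."
```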
    7stud --, Jul 22, 2009
    #6
  7. Max Norman

    Max Norman Guest

    I solved the problem by saving the text as 'plain text' in TextMate.
    TextEdit had saved the file as rich text, preserving the formatting
    from the website I copied the text from.
    Max Norman, Jul 22, 2009
    #7
