Any way to take a word as input from stdin?

CBFalconer

Keith said:
.... snip ...


Richard no-last-name has made a hobby out of insulting Chuck Falconer
at every opportunity, even dragging his name into discussions in which
Chuck has not participated.

It is totally pointless, since I have Richard the un-named PLONKed,
and I never see his silly diatribes.
 
CBFalconer

James said:
.... snip ...

The only way he knows how to clearly describe what he wants his
code to do is by providing a C++ example; this has been made
abundantly clear by his failed attempts to clearly describe it in
English. However, the code he wants to write should be in C.

If he were to post this same question to comp.lang.c++, and there
were a C++BFalconer on comp.lang.c++, C++BFalconer would certainly
respond by saying that this C question is off-topic in
comp.lang.c++. Should arnuld then simply remain silent about his
question?

I disagree. If he wants to use a C++ algorithm as illustration he
should translate that algorithm to C. In fact, a good example
would be a lexer for a C compiler.
 
CBFalconer

arnuld said:
.... snip ...

okay, that seems a good reply. I mean, we make it topical to C
again, as I got a little lost in the confusion. So *my* definition of
word will be the same one you gave earlier:

A "word" is a non-empty contiguous sequence of characters other
than space, tab, or newline, preceded or followed either by a
space, tab, or newline or by the start or end of the input.

So any sequence of control chars, such as '\16', '\17' can go in a
word? Just illustrating the difficulties. For example,
identifiers in C start with 'a'..'z', 'A'..'Z', '_', and continue
with the same plus '0'..'9'. This assumes (not valid for C) that
'a'..'z' are contiguous, as are 'A'..'Z'. When the word has been
parsed it has to be checked against a (limited) list of reserved
words.
 
jameskuyper

CBFalconer said:
I disagree. If he wants to use a C++ algorithm as illustration he
should translate that algorithm to C. In fact, a good example
would be a lexer for a C compiler.

His question was basically about how to translate the C++ algorithm to
C. So what you're saying is that he must answer his own question
before he can ask it here? I'm curious, where do you think he should
go to get help with the translation, since you've ruled out coming
here for help with it; and C++BFalconer would presumably rule out
going to clc++ for such a question? And when he finally does ask it,
according to you, his question is required to take the form "How do I
translate this algorithm {algorithm already translated into C}, into
C?". That's patently ridiculous.
 
Keith Thompson

CBFalconer said:
arnuld wrote:
... snip ...

So any sequence of control chars, such as '\16', '\17' can go in a
word? Just illustrating the difficulties. For example,
identifiers in C start with 'a'..'z', 'A'..'Z', '_', and continue
with the same plus '0'..'9'. This assumes (not valid for C) that
'a'..'z' are contiguous, as are 'A'..'Z'. When the word has been
parsed it has to be checked against a (limited) list of reserved
words.

Yes, given the definition above, this string:

"\16 \17"

contains two "words". Are you suggesting that that's a problem?

Obviously a program that's intended to recognize C identifiers would
have to use a different rule. But the OP didn't say anything about C
identifiers, so I'm not sure why you're bringing them up.

Incidentally, on my initial reading of your followup, I thought your
use of the word "contiguous" was meant to be related to the use in
arnuld's definition of "word" (the one I had suggested earlier). In
fact, they're quite different; in the definition of "word" it refers
to the characters being adjacent in the input, not to their numeric
representations. A more careful reading of what you wrote indicates
that you just meant that the notation 'a'..'z' doesn't make sense
unless the representations of those characters are numerically
contiguous. I thought I should point this out in case anyone else is
confused.
 
jameskuyper

CBFalconer said:
So any sequence of control chars, such as '\16', '\17' can go in a
word? Just illustrating the difficulties. For example,
identifiers in C start with 'a'..'z', 'A'..'Z', '_', and continue
with the same plus '0'..'9'. ...

If you're going to bother pointing out that
... This assumes (not valid for C) that
'a'..'z' are contiguous, as are 'A'..'Z'.

then you shouldn't be assuming that '\16' and '\17' are control
characters; if we're not assuming ASCII, then they just might
represent ' ' and 'a', respectively.

Identifying what arnuld calls "words" is much simpler than identifying
C identifiers; in fact, I can't quite figure out why you bothered
bringing up the definition of C identifiers. All that arnuld's code
needs to do is check for the delimiting " \t\n" characters. In fact,
since he has said that he wants to mimic the behavior of the C++ code
which he provided as an example, he probably left out the
form-feed, carriage return, and vertical tab characters only by accident.
If he adds "\f\r\v" to the delimiter list, then the simplest way to
handle the delimiter check is to just call isspace().
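
A minimal sketch of the isspace() approach described above, assuming the input arrives on a FILE * (pass stdin for the case in this thread). The name count_words is hypothetical and is not taken from anyone's posted code. In the "C" locale isspace() is true for exactly space, '\t', '\n', '\f', '\r' and '\v'.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: count whitespace-delimited words on a stream.
   A "word" here is a maximal run of characters for which isspace()
   is false. getc() returns the character as an unsigned char value
   (or EOF), so it can be passed to isspace() directly. */
size_t count_words(FILE *in)
{
    size_t count = 0;
    int in_word = 0;   /* nonzero while we are inside a word */
    int ch;

    while ((ch = getc(in)) != EOF) {
        if (isspace(ch)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;   /* first character of a new word */
            ++count;
        }
    }
    return count;
}
```

Nothing in this loop depends on the character set: control characters, punctuation and letters are all treated alike, which matches the definition of "word" quoted earlier.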
 
Antoninus Twink

It is totally pointless, since I have Richard the un-named PLONKed,
and I never see his silly diatribes.

Fortunate, then, that your posts have become so embarrassingly
error-ridden in recent months that another Richard, with a surname we
all know only too well, has started posting similar diatribes that you
surely *do* read.
 
Kenny McCormack

Fortunate, then, that your posts have become so embarrassingly
error-ridden in recent months that another Richard, with a surname we
all know only too well, has started posting similar diatribes that you
surely *do* read.

Of course, now, KT himself has gotten on the bashing CBF bandwagon.

Good on him!
 
Old Wolf

arnuld said:

Let me give you an example from ordinary English, where whitespace
delimiters are not sufficient:

"What did he say?", said Albert.
"He just said, 'I'll be there', I think", replied the captain.

Now, consider the whitespace-separated tokens:

A bit sidetracked from the original thread, but is
there actually any problem here besides identifying
whether a ' symbol is a quote mark or an apostrophe?
 
CBFalconer

Old said:
.... snip ...


A bit sidetracked from the original thread, but is
there actually any problem here besides identifying
whether a ' symbol is a quote mark or an apostrophe?

And I gather you consider that a trivial problem? Please describe
your algorithm.
 
CBFalconer

His question was basically about how to translate the C++ algorithm to
C. So what you're saying is that he must answer his own question
before he can ask it here? I'm curious, where do you think he should
go to get help with the translation, since you've ruled out coming
here for help with it; and C++BFalconer would presumably rule out
going to clc++ for such a question? And when he finally does ask it,
according to you, his question is required to take the form "How do I
translate this algorithm {algorithm already translated into C}, into
C?". That's patently ridiculous.

You certainly make a good point.
 
CBFalconer

Keith said:
Yes, given the definition above, this string:

"\16 \17"

contains two "words". Are you suggesting that that's a problem?

I didn't specify a string. I meant those characters contiguous
(i.e. one strictly following the other) in the input stream. The
detection I specified above can be done with one char look ahead.
The presence (and necessity) of such a look ahead scheme may not be
obvious to the casual reader. In C it revolves around the ungetc()
function.
Obviously a program that's intended to recognize C identifiers would
have to use a different rule. But the OP didn't say anything about C
identifiers, so I'm not sure why you're bringing them up.

Incidentally, on my initial reading of your followup, I thought your
use of the word "contiguous" was meant to be related to the use in
arnuld's definition of "word" (the one I had suggested earlier). In
fact, they're quite different; in the definition of "word" it refers
to the characters being adjacent in the input, not to their numeric
representations. A more careful reading of what you wrote indicates
that you just meant that the notation 'a'..'z' doesn't make sense
unless the representations of those characters are numerically
contiguous. I thought I should point this out in case anyone else is
confused.

Right. I should have specified 'the values of the chars are
contiguous'. The point being that ASCII works fine, but EBCDIC
doesn't. The C lexer will be a good example, because what it has
to detect is well defined.
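
A rough sketch of the one-character look-ahead mentioned above, applied to C-style identifiers. The function name read_identifier and its buffer interface are hypothetical. It uses isalpha() and isalnum() rather than range tests such as ch >= 'a' && ch <= 'z', so it does not rely on the letters having contiguous values; in the default "C" locale these functions accept only the basic letters and digits, though other locales may accept more.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: read one C-style identifier from `in` into buf.
   At most size-1 characters are stored and buf is NUL-terminated
   whenever size > 0; extra characters of an over-long identifier are
   consumed but not stored. Returns the number of characters stored,
   or 0 if the next character cannot start an identifier. */
size_t read_identifier(FILE *in, char *buf, size_t size)
{
    size_t n = 0;
    int ch;

    if (size == 0)
        return 0;

    ch = getc(in);
    if (ch != EOF && (isalpha(ch) || ch == '_')) {
        do {
            if (n + 1 < size)
                buf[n++] = (char)ch;
            ch = getc(in);
        } while (ch != EOF && (isalnum(ch) || ch == '_'));
    }

    /* ch is the look-ahead character: it is not part of the
       identifier, so push it back for whoever reads next. */
    if (ch != EOF)
        ungetc(ch, in);

    buf[n] = '\0';
    return n;
}
```

The standard guarantees only one character of push-back per stream, which is all this scheme needs. Checking the result against the list of reserved words would be a separate step.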
 
arnuld

I was being a bit vague. Let's leave actual array pointers out of
this. I mean that Richard was talking about changing the char ** as
seen from the calling function. The thing you are intending to pass,
a char **, is in some sense a pointer to the whole array: from it all
of the array's data is accessible. The trouble is you can't
change this char ** inside the function -- not in a way that has any
effect outside. All you can do is change the various things it points
to.

If a function needs to change an int, you pass an int *. If it needs
to change an int *, you pass an int **. If it needs to change an int **,
you must pass an int ***.
... SNIP....
Typo! I meant you *can't* write any value into **ppc! Sorry. There
are two typos, I now see. It should have read: "*ppc is NULL -- you
set it to be NULL before the call. You can't write any value into
**ppc."


see my new post titled "pointers passed by copying ?"
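
The rule quoted above (to change a T in the caller, pass a T *) can be illustrated with a small sketch. The names greeting, point_by_value and point_by_address are made up for this example and do not come from the thread.

```c
/* Hypothetical string for the functions below to point at. */
static char greeting[] = "hello";

/* Receives a copy of the caller's pointer. The assignment changes
   only that local copy and is lost when the function returns. */
void point_by_value(char *p)
{
    p = greeting;
}

/* Receives the address of the caller's pointer, so assigning
   through it changes the caller's own variable. */
void point_by_address(char **pp)
{
    *pp = greeting;
}
```

After char *s = NULL; a call to point_by_value(s) leaves s equal to NULL, while point_by_address(&s) leaves s pointing at "hello".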
 
Old Wolf

And I gather you consider that a trivial problem?  Please describe
your algorithm.

Not at all, I was just checking that there
wasn't some other problem besides this one,
that I hadn't seen.
 
Old Wolf

but now you have to list in your dictionary every single character
combination that you consider to be a word. Big dictionary. (For a start,
every word will need at least three entries: "word", "Word", "WORD".)

The dictionary approach is clumsy in the extreme, and the algorithmic
approach gets more and more difficult as you get pickier and pickier about
what does and what does not constitute a word.

Surely there is no approach other than using
a sophisticated dictionary. For example:

'Tis the season to be playin'

there is no rule to deduce whether we have
quote marks or apostrophes, besides knowing
that 'Tis is a word.

The dictionary can include rules such as
the fact that if "abcd" is a word, then
so is "Abcd"; it can know that acronyms
can be written with periods, and so on.

Now where it gets harder is if you have to
accept text from people who make spelling
mistakes and typos :)
 
arnuld

...SNIP...
But what if something goes wrong? You'll need to be able to report an
error. The natural way to do this is via a return value, which means we
can't use that value for either the list or the count, and that leads us
to:

What will we do with that return value? If something goes wrong I can
simply exit the program, telling the user that he did something stupid and
he is responsible for that.


int get_words(char ***, size_t *);

Since they don't need to modify the caller's status, sort_words and
print_words can be of type int(char **, size_t).


I think there is a qsort in the standard library, hence we can use that, but I
don't know whether it modifies the original array or not.
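
On the qsort question: qsort sorts in place, so it does rearrange the array it is given. For an array of char *, only the pointers are reordered; the strings they point to are not touched. A minimal sketch follows, using the int(char **, size_t) shape suggested above. The comparator name compare_words is hypothetical, and the return value is always 0 here because qsort has no way to report failure.

```c
#include <stdlib.h>
#include <string.h>

/* qsort hands the comparator pointers to two array elements. Each
   element is a char *, so the arguments are really pointers to
   char * and must be dereferenced once before calling strcmp. */
static int compare_words(const void *a, const void *b)
{
    const char *const *pa = a;
    const char *const *pb = b;
    return strcmp(*pa, *pb);
}

/* Sketch of sort_words: reorders the pointers in `words` in place.
   The strings themselves are neither copied nor modified. */
int sort_words(char **words, size_t count)
{
    qsort(words, count, sizeof *words, compare_words);
    return 0;
}
```

Because sort_words changes only what `words` points to, and not the caller's char ** itself, a char ** parameter is enough here.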



Up to you, but I wouldn't bother setting a limit (or, if I did, I'd set it
at a million or so, and treat any string longer than that as a reportable
error). With dynamic allocation, you don't /need/ to set a limit; you
simply allocate as you go, and reallocate if necessary.



okay, I will write the program in parts. First we will write a simple
program that asks the user for input, and we will store that word
dynamically, using calloc, in some array. It will be called get_single_word
and it will form the basis of get_words function which will store all
words in an array. get_single_word returns an int because I want to use
get_single_word in get_words like this:

while( get_single_word )
{
/* code for get_words */
}


Here is my code for get_single_word. PROBLEM: it does not print anything
I entered:


/* a program to get a single word from stdin */


#include <stdio.h>
#include <stdlib.h>

enum { AVERAGE_SIZE = 28 };


int get_single_word( char* );

int main( void )
{
char* pw; /* pw means pointer to word */


get_single_word( pw );

printf("word you entered is: %s\n", pw);

return 0;
}



int get_single_word( char* pc )
{
int idx;
int ch;
char *pc_begin;

pc = calloc(AVERAGE_SIZE-1, sizeof(char));
pc_begin = pc;

if( (! pc) )
{
perror("can not allocate memory, sorry babe!");
return 1;
}

for( idx = 0; ( (ch = getchar()) != EOF ); ++idx, ++pc )
{
if( AVERAGE_SIZE == idx )
{
/* use realloc here which I have no idea how to write */
}

*pc = ch;
}

*++pc = '\0';
free(pc_begin);

return 0;
}

=================== OUTPUT ==================
[arnuld@dune ztest]$ gcc -ansi -pedantic -Wall -Wextra test.c
[arnuld@dune ztest]$ ./a.out
like
word you entered is:
[arnuld@dune ztest]$
 
arnuld

.... SNIP...
Here is my code for get_single_word. PROBLEM: it does not print anything
I entered:
.... SNIP...


I have even tried using pointer to pointer but that still leaves me with
the same problem:


int main( void )
{
char* pw; /* pw means pointer to word */


get_single_word( &pw );

printf("word you entered is: %s\n", pw);

return 0;
}



int get_single_word( char** pc )
{
int idx;
int ch;
char *pc_begin;

*pc = calloc(AVERAGE_SIZE-1, sizeof(char));
pc_begin = *pc;

if( (! *pc) )
{
perror("can not allocate memory, sorry babe!");
return 1;
}

for( idx = 0; ( (ch = getchar()) != EOF ); ++idx, ++*pc )
{
if( AVERAGE_SIZE == idx )
{
/* use realloc here which I have no idea how to write */
}

**pc = ch;
}

*++pc = '\0';
free(pc_begin);

return 0;
}
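
Both attempts above appear to share one problem: the buffer is freed inside get_single_word before main prints it. In the first version, pc is also only a copy of pw, so main's pw is never set at all. In the second, ++*pc moves the caller's pointer away from the start of the buffer, and *++pc = '\0' advances the char ** itself rather than terminating the string. What follows is only a sketch of one possible repair, not a definitive version. It assumes a FILE * parameter (pass stdin), a different return convention from the original (1 for a word, 0 at end of input, -1 on allocation failure), and the hypothetical name INITIAL_SIZE.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

enum { INITIAL_SIZE = 28 };

/* Sketch: read one whitespace-delimited word from `in` and store a
   malloc'd, NUL-terminated copy in *pword.
   Returns 1 if a word was read, 0 at end of input, -1 if memory ran
   out. On a return of 1 the caller owns *pword and must free() it. */
int get_single_word(FILE *in, char **pword)
{
    size_t size = INITIAL_SIZE;
    size_t len = 0;
    char *buf;
    int ch;

    *pword = NULL;

    /* Skip leading whitespace. */
    while ((ch = getc(in)) != EOF && isspace(ch))
        continue;
    if (ch == EOF)
        return 0;

    buf = malloc(size);
    if (buf == NULL)
        return -1;

    do {
        if (len + 1 == size) {          /* keep room for the '\0' */
            char *tmp = realloc(buf, size * 2);
            if (tmp == NULL) {
                free(buf);              /* old block is still valid */
                return -1;
            }
            buf = tmp;
            size *= 2;
        }
        buf[len++] = (char)ch;
    } while ((ch = getc(in)) != EOF && !isspace(ch));

    buf[len] = '\0';
    *pword = buf;   /* hand the start of the buffer to the caller */
    return 1;
}
```

The function indexes into buf instead of advancing a pointer, so the start of the allocation is never lost, and the result of realloc goes into a temporary so the old block can still be freed if the call fails. A caller could then loop with while (get_single_word(stdin, &w) == 1), printing and freeing w on each pass.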
 
Ron Ford

Old Wolf said:


I think it's about here that I like to pretend I'm from Missouri.

Show me.

As it polls redder with the Palin nomination, Huck sighed, 'Ashcroft
sucks."
 
arnuld

Yes, you could do that, except that (a) it might not be the user's stupid
fault (it may simply be that your machine is low on memory), and (b) there
may be a way to recover. If this is a mere learning exercise and the
learning task is not error recovery, then yes, by all means bomb out.
That's the "student solution" and, like cryptosporidium, is very common.

http://en.wikipedia.org/wiki/Cryptosporidium

...aye... so let's learn the practical aspects like error recovery too. I
don't like academic solutions, BTW.
 
Nick Keighley

On Tue, 16 Sep 2008 06:28:30 +0000, Richard Heathfield posted:


As it polls redder with the Palin nomination, Huck sighed, 'Ashcroft
sucks."

If you have a political point to make then do it in English.
Or, better, don't make the point on a technical newsgroup.
 
