the de-facto way to "parse" input

Krumble Bunk

Hi all,

I am trying my hand at writing a shell for unix. A very rubbish
shell, but nonetheless, I have come to a point where I am confused.

I would like to have something like

shell> stop xyz

whereupon the command "stop" will take the argument "xyz" and perform
foo action on it. What in your opinion is the best (easiest?) way to
validate the input (perhaps iterate over a "valid commands" table),
and what calls would you use? (getc(), scanf(), a big while loop and
pointer arithmetic, ...)

I hope this question is not too ambiguous

many thanks
kb
 
Chris Dollin

Krumble said:
I am trying my hands at writing a shell for unix. A very rubbish
shell, but nonetheless, I come to a point where I am confused.

I would like to have something like

shell> stop xyz

whereupon the command "stop" will take the argument "xyz" and perform
foo action on it. What in your opinion is the best (easiest?) way to
validate the input (perhaps iterate over a "valid commands" table),
and what calls would you use? (getc(), scanf(), a big while loop and
pointer arithmetic, ...)

I'd use `fgets` to read the line, expanding the buffer as necessary,
carve the line up into space-separated chunks (if we're doing a rubbish
shell, we won't worry about quoting ...) and then I can look the first
chunk up in a table.
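
A minimal sketch of the split-and-look-up step, assuming the line has already been read into a writable buffer. The names `split_words`, `dispatch`, `cmd_stop`, `cmd_help` and the `MAX_WORDS` limit are made up for illustration; quoting is ignored, as suggested above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_WORDS 16   /* arbitrary limit for this sketch */

/* Hypothetical handler type: receives the word count and the words. */
typedef int (*handler_fn)(int argc, char **argv);

static int cmd_stop(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: stop <name>\n");
        return 1;
    }
    printf("stopping %s\n", argv[1]);
    return 0;
}

static int cmd_help(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    puts("commands: stop, help");
    return 0;
}

/* The "valid commands" table: command name and the function that handles it. */
static const struct { const char *name; handler_fn run; } commands[] = {
    { "stop", cmd_stop },
    { "help", cmd_help },
};

/* Split line in place on spaces, tabs and newlines; returns the word count. */
int split_words(char *line, char **words, int max_words)
{
    int n = 0;
    assert(line != NULL && words != NULL);
    for (char *w = strtok(line, " \t\n"); w != NULL && n < max_words;
         w = strtok(NULL, " \t\n"))
        words[n++] = w;
    return n;
}

/* Look the first word up in the table and call its handler.
   Returns the handler's result, or -1 for an empty or unknown command. */
int dispatch(char *line)
{
    char *words[MAX_WORDS];
    int n = split_words(line, words, MAX_WORDS);
    if (n == 0)
        return -1;
    for (size_t i = 0; i < sizeof commands / sizeof commands[0]; i++)
        if (strcmp(words[0], commands[i].name) == 0)
            return commands[i].run(n, words);
    fprintf(stderr, "%s: unknown command\n", words[0]);
    return -1;
}
```

Calling `dispatch` on a buffer holding "stop xyz" runs `cmd_stop` with "xyz" as its argument. Because `strtok` writes into the buffer, the line must not be a string literal.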

If we want something less rubbish, I'd write a recursive-descent
parser for commands. That would force me to be explicit about the
grammar I'm using and what my tokens are supposed to be. I'd build
an abstract-syntax tree for the commands; on no account would I
try and execute them while parsing them.
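
To make the idea concrete, here is a rough sketch of a recursive-descent parser that builds a small tree and executes nothing. The grammar is an assumption chosen for illustration (words, `|` for pipes, `;` for sequencing), not a real shell grammar, and every name in it (`node`, `parse`, `parse_chain`, `MAX_WORDS`) is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORDS 8   /* arbitrary per-command limit for this sketch */

typedef enum { N_CMD, N_PIPE, N_SEQ } node_kind;

typedef struct node {
    node_kind kind;
    char *words[MAX_WORDS + 1];   /* N_CMD: NULL-terminated word list */
    struct node *left, *right;    /* N_PIPE and N_SEQ: the two operands */
} node;

static const char *cur;           /* current position in the input line */

static void skip_blanks(void) { while (*cur == ' ' || *cur == '\t') cur++; }

/* calloc that gives up on failure, to keep the parser itself short */
static void *xcalloc(size_t count, size_t size)
{
    void *p = calloc(count, size);
    if (p == NULL) { perror("calloc"); exit(EXIT_FAILURE); }
    return p;
}

static node *new_node(node_kind kind, node *left, node *right)
{
    node *n = xcalloc(1, sizeof *n);
    n->kind = kind;
    n->left = left;
    n->right = right;
    return n;
}

void free_tree(node *n)
{
    if (n == NULL) return;
    for (int i = 0; i < MAX_WORDS; i++)
        free(n->words[i]);
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}

/* command := WORD { WORD } ; returns NULL if no word is present */
static node *parse_command(void)
{
    node *n = new_node(N_CMD, NULL, NULL);
    int count = 0;
    for (;;) {
        skip_blanks();
        size_t len = strcspn(cur, " \t\n|;");
        if (len == 0 || count == MAX_WORDS) break;
        n->words[count] = xcalloc(len + 1, 1);   /* zeroed, so already terminated */
        memcpy(n->words[count++], cur, len);
        cur += len;
    }
    if (count == 0) { free_tree(n); return NULL; }
    return n;
}

/* chain := operand { op operand } ; shared by the two binary rules below */
static node *parse_chain(node *(*operand)(void), char op, node_kind kind)
{
    node *left = operand();
    while (left != NULL) {
        skip_blanks();
        if (*cur != op) break;
        cur++;
        node *right = operand();
        if (right == NULL) { free_tree(left); return NULL; }  /* dangling operator */
        left = new_node(kind, left, right);
    }
    return left;
}

/* pipeline := command { '|' command } */
static node *parse_pipeline(void) { return parse_chain(parse_command, '|', N_PIPE); }

/* list := pipeline { ';' pipeline } */
static node *parse_list(void) { return parse_chain(parse_pipeline, ';', N_SEQ); }

/* Parse a whole line. Returns NULL for an empty line or a syntax error. */
node *parse(const char *line)
{
    assert(line != NULL);
    cur = line;
    node *tree = parse_list();
    skip_blanks();
    if (tree != NULL && *cur != '\0' && *cur != '\n') { free_tree(tree); return NULL; }
    return tree;
}
```

For "ls -l | wc ; stop xyz" this yields an `N_SEQ` node whose left child is an `N_PIPE` of the two commands and whose right child is the `stop` command. A separate function would then walk the tree to run it, which keeps parsing and execution apart and makes the parser easy to test on its own.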

And I'd write unit tests. Lots of unit tests. And get something
working end-to-end as soon as possible. (Because it's very
disheartening spending a day or more writing a Super Duper
Program That Does It All, and then spending a week or more
debugging it until it does /something/, as opposed to writing
the smallest program one can manage that recognisably does
something right. Like, read a command line in, and print out
the tokens, /and do nothing else/.)
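
That smallest first program might look roughly like the following sketch: read a line of any length by growing the buffer around `fgets`, then print the tokens and nothing else. The function names `read_line` and `print_tokens` and the starting size of 64 are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one line of any length from fp, without the trailing newline.
   Returns a malloc'd string that the caller frees, or NULL at end of
   input or if memory runs out. */
char *read_line(FILE *fp)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);

    assert(fp != NULL);
    if (buf == NULL)
        return NULL;
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';
            return buf;
        }
        /* No newline yet: double the buffer and read the next piece. */
        char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }
    if (len > 0)
        return buf;   /* final line had no newline */
    free(buf);
    return NULL;
}

/* Print each space- or tab-separated token on its own line.
   Modifies line in place and returns the number of tokens. */
int print_tokens(char *line, FILE *out)
{
    int n = 0;
    for (char *t = strtok(line, " \t"); t != NULL; t = strtok(NULL, " \t"))
        fprintf(out, "%d: [%s]\n", n++, t);
    return n;
}
```

A `main` that loops on `read_line(stdin)`, passes each line to `print_tokens(line, stdout)` and frees it would complete the program. Keeping the two functions separate also makes each one easy to cover with a unit test.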

And likely throw away the first attempt, as a learning exercise.
 
Krumble Bunk

Chris Dollin wrote:

[.....]


the tokens, /and do nothing else/.)

And likely throw away the first attempt, as a learning exercise.

Very good advice - I will investigate using lex/yacc.

thanks

kb
 
rahul

Krumble Bunk wrote:
Very good advice - I will investigate using lex/yacc.

thanks

kb

lex is by and large the de facto way of tokenizing. I believe gcc has
made use of yacc (or maybe bison, but that does not make a hell of a
difference), although its current C and C++ front ends use hand-written
recursive-descent parsers.
 
Bartc

Chris Dollin said:
If we want something less rubbish, I'd write a recursive-descent
parser for commands. That would force me to be explicit about the
grammar I'm using and what my tokens are supposed to be. I'd build
an abstract-syntax tree for the commands; on no account would I
try and execute them while parsing them.

An AST for a simple command-line interpreter?

How complex would this shell have to be to make this worthwhile?

Chris Dollin said:
(Because it's very disheartening spending a day or more writing ...

Might be more disheartening to start a huge project that doesn't finish
because it's overspecified.

I would have suggested starting with something like the following,
replacing the system() call (and perhaps adjusting the parameter) with the
local equivalent. This would require the handler for each 'command' to be a
separate C program, but has the advantage of the parameters being already
separated.

A bit more work and the commands and parameters can be identified and
executed in the same program.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void main()   /* non-standard: main should return int, as a reply below points out */
{
#define llength 1000
    char line[llength];
    size_t n;

    puts("Type exit to exit.");
    puts("");

    while (1) {
        printf("Prompt>");
        fflush(stdout);

        if (fgets(line, llength, stdin) == NULL) break;

        n = strlen(line); /* get rid of troublesome trailing \n */
        if (n > 0 && line[n-1] == '\n') line[n-1] = 0;

        if (strcmp(line, "exit") == 0) break;

        if (line[0])     /* skip empty lines */
            system(line);
    }
}
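
On a Unix system, one plausible "local equivalent" of the system() call is the POSIX trio fork(), execvp() and waitpid(). The sketch below assumes the line has already been split into a NULL-terminated argument array; the function name `run_command` is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given NULL-terminated arguments and wait for it.
   Returns the child's exit status, or -1 if it could not be started or
   did not exit normally. */
int run_command(char *const argv[])
{
    int status;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);
    fflush(NULL);                 /* avoid duplicated buffered output in the child */
    pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {               /* child: replace ourselves with the command */
        execvp(argv[0], argv);
        perror(argv[0]);          /* only reached if execvp failed */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) < 0) {   /* parent: wait for the child */
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Unlike system(), this does not pass the line through /bin/sh, so the shell being written has to do its own word splitting, and later its own pipes and redirection.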
 
vippstar

Bartc said:
void main()
<snip>
Bartc, haven't you been here long enough to remember main returns int?
 
Bart

<snip>
Bartc, haven't you been here long enough to remember main returns int?

Yes, but I suspect I didn't write that bit. Probably the remnants of a
copy and paste of someone else's code. Not my fault at all.
 
Chris Dollin

Bartc said:
An AST for a simple command-line interpreter?

"something less rubbish" allows for something that isn't simple.

ASTs aren't complicated, even in C.

Bartc said:
How complex would this shell have to be to make this worthwhile?

Pipes, sequencing, commands. Brackets and built-in commands,
definitely.

Bartc said:
Might be more disheartening to start a huge project that doesn't finish
because it's overspecified.

Where did "huge" come from? And "overspecified"?
 
Bart

Chris Dollin said:
"something less rubbish" allows for something that isn't simple.

ASTs aren't complicated, even in C.

Pipes, sequencing, commands. Brackets and built-in commands,
definitely.

I'm not familiar with unix shells. But I don't remember seeing
anything more complicated than a linear series of commands, filenames,
numbers and switches in Windows' shell. But then, maybe Windows' shell
is rubbish.

Chris Dollin said:
Where did "huge" come from? And "overspecified"?

OK, not huge. But I associate ASTs with compilers, and that would seem
overkill for this task.

Perhaps the OP should start by writing the specification of his/her
syntax; then it might become clearer which approach is best.
 
