simple word wrap problem not wrapping

Douglas G

I've tried various ideas on this problem, but I don't see word wrapping.

Can you point out what is wrong? It's a K&R exercise, and I'm still new to
programming. Other pointers would be helpful too.

#include "header.h"
/* does the wordwrapping */

void fold(char buffer[], int len)
{
int start_point,i;

start_point=i=0;
while (len > (start_point +COLUMN)) {
i=start_point+COLUMN;
while ( buffer[i]!=' ' || buffer[i]!='\t')
start_point=--i;
buffer[++i]='\n';
}
start_point=i=0;

return;
}

[snippet from header.h]

/* header.h */
#include <stdio.h>
#define MAXLINE 1000 /* maximum input line size */
#define COLUMN 35 /* length before fold */
 
Malcolm

Douglas G said:
I've tried various ideas on this problem, but I don't see word wrapping.

Can you point out what is wrong? It's a K&R exercise, and I'm still new to
programming. Other pointers would be helpful too.

#include "header.h"
/* does the wordwrapping */
This comment is better than nothing, but still way too terse. Describe what
the function is meant to do. Does it take in a string of arbitrary length
and replace spaces by newlines to achieve wrapping? If so what does it do if
passed a degenerate unwrappable string?
Why are you passing the length of the string? Does this mean the string need
not be NUL-terminated, or is it for efficiency? You need to explain.
void fold(char buffer[], int len)
{
int start_point,i;

start_point=i=0;

while (len > (start_point +COLUMN)) {
i=start_point+COLUMN;
while ( buffer[i]!=' ' || buffer[i]!='\t')
start_point=--i;
buffer[++i]='\n';
}
start_point=i=0;

The logic looks highly dodgy to me. Remember you can be passed any string.
return;
}

[snippet from header.h]

/* header.h */
#include <stdio.h>
#define MAXLINE 1000 /* maximum input line size */
#define COLUMN 35 /* length before fold */
 
Douglas G

Malcolm said:
This comment is better than nothing, but still way too terse. Describe what
the function is meant to do. Does it take in a string of arbitrary length
and replace spaces by newlines to achieve wrapping? If so what does it do
if passed a degenerate unwrappable string?
Why are you passing the length of the string? Does this mean the string
need not be NUL-terminated, or is it for efficiency? You need to explain.

Basic word wrap in that at a certain column length it checks if there is
white space and replaces it with a newline. Otherwise it counts back to
the first white space and replaces it. Then the counter should be set to
the next spot. So if the line is actually longer than several column
lengths it will catch.

Part of the exercise. The routine that catches the input places a null
terminator at the end as well as passes the length back. Saves me time
trying to find the length again.
void fold(char buffer[], int len)
{
int start_point,i;

start_point=i=0;

while (len > (start_point +COLUMN)) {
i=start_point+COLUMN;
while ( buffer[i]!=' ' || buffer[i]!='\t')
start_point=--i;
buffer[++i]='\n';
}
start_point=i=0;

The logic looks highly dodgy to me. Remember you can be passed any
string.

Any suggestions on it? I'm new to programming. I thought it was concise
enough. It still fails the final test, which is, "does it work?"
return;
}

[snippet from header.h]

/* header.h */
#include <stdio.h>
#define MAXLINE 1000 /* maximum input line size */
#define COLUMN 35 /* length before fold */
 
Arthur J. O'Dwyer

Douglas G said:
I've tried various ideas on this problem, but I don't see word wrapping.
Can you point out what is wrong? [...]
/* does the wordwrapping */
This comment is better than not[h]ing, but still way too terse.

By some people's standards, it's worse than nothing---it's a blatant
lie! After all, the whole point of the OP's post was that this function
does /not/ do word wrapping. At least, not so he could "see" it. :) A
better comment would have been

/* Tries to do wordwrapping */

followed by an explanation of why and how it actually /doesn't/ succeed.
If the OP had tried to write down a description of the bug, he probably
would have discovered (some of) the problems on his own, with no outside
help required!

HTH,
-Arthur
 
Joe Wright

Douglas G wrote:

Snip All

Allow me please to re-state the problem. I think what you want to do
is to re-format a text file and place the '\n' in a more convenient
place. Right?

There is no need to do this line by line. There is no need for
strings of any sort. Consider a short routine I wrote yesterday.

/*
Cursory examination of Word .doc files shows the text starting
at 0x600 bytes into the file and ending with '\0'. Line ending
is a single 0x0D character. Apple? :=)
Should be easy enough, right?
*/

#include <stdio.h>

int main(int argc, char *argv[]) {
FILE *in;
int c, w = 0, sp = 0;
in = (argc > 1) ? fopen(argv[1], "rb") : NULL; /* no file name given: do nothing */
if (in != NULL) {
fseek(in, 0x600, SEEK_SET);
while ((c = fgetc(in)) != '\0' && c != EOF) {
if (++w > 60) {
if (c == ' ')
sp = 1;
if (c != ' ' && c != '\r' && sp == 1) {
ungetc(c, in);
c = '\r';
}
}
if (c == '\r')
putchar('\n'), w = sp = 0;
else
putchar(c);
}
fclose(in);
}
return 0;
}

Not a string in sight. And it wraps.
 
Douglas G

Joe said:
Douglas G wrote:

Snip All

Allow me please to re-state the problem. I think what you want to do
is to re-format a text file and place the '\n' in a more convenient
place. Right?

There is no need to do this line by line. There is no need for
strings of any sort. Consider a short routine I wrote yesterday.
I guess I should give the whole story. The problem is that input is
sent straight through the program with no alterations whatsoever.

The input routine collects it, and ends the input with \n and then \0 and
returns the length of the string.

The troublesome routine doesn't do any word wrapping at all, as I have
adjusted the size in the header file to the ridiculous in order to try and
see any effects.

The routine is assuming that the length could be anywhere up to the maximum
size of the input 1000 characters. Which means it would need word wrapping
more than once until it reaches the end.

My intended approach was to take a starting point plus the row length and
start checking backwards for the first available whitespace and then
replace it with a newline. Then change the starting point to the current
position and then iterate through the loop until the next starting point
plus the row length exceeded the length of the string.

However, the output looks like the program never ran: it doesn't wrap,
doesn't complain, and doesn't segfault. I've tried it as a single-file program
to make sure nothing was lost by splitting things up. No changes
whatsoever. I've added the -pedantic -Wall switches and made a few
changes. So here is the complete program warts and all. Other suggestions
are welcome, since I don't know any programmers and would
welcome anything that helps me program with better habits.

#include <stdio.h>
#define MAXLINE 1000 /* maximum input line size */
#define ROW_LENGTH 25 /* length before fold */


int getline(char input_line[], int length_of_input);
void putline(char buffer[], int line_length);
void fold (char s[], int len);


/* does a word wrap at designated spots using ROW_LENGTH. */

int main()
{
int i, len, start_point;
char line[MAXLINE];

len=start_point=i=0;
while ((len = getline( line, MAXLINE)) > 0) {

/* start of word wrap */
fold(line, len);
putline(line, len);
}
return 0;
}
/* does the wordwrapping using ROW_LENGTH as the start point
and starts back until it finds a white space and changes
it to a new line and iterates through this until it exceeds
the length passed to it */

void fold(char buffer[], int len)
{
int start_point,i;

start_point=i=0;
while (len > (start_point +ROW_LENGTH)) {
i=start_point+ROW_LENGTH;
while ( buffer[i]!=' ' || buffer[i]!='\t')
start_point=--i;
buffer[++i]='\n';
}
start_point=i=0;

return;
}
/* putline: displays the line */

void putline(char buffer[], int lim)
{
int i;
for (i=0; i <lim; ++i)
putchar(buffer[i]);
return;
}
/* getline: read a line into s return length */

int getline(char s[], int lim)
{
int c,i;
for (i=0; i <MAXLINE-1 && (c=getchar())!=EOF && c!='\n'; ++i)
s[i]=c;
if (c=='\n') {
s[i]=c;
++i;
}
s[i]='\0';
return i;
}
 
Dave Thompson

I guess I should give the whole story. The problem is that input is
sent straight through the program with no alterations whatsoever.
void fold(char buffer[], int len)
{
int start_point,i;

start_point=i=0;
while (len > (start_point +ROW_LENGTH)) {
i=start_point+ROW_LENGTH;
while ( buffer[i]!=' ' || buffer[i]!='\t')
start_point=--i;


This is your problem. You want to search for a character that is
either a space or tab; to do that you want to skip every character
that is not space AND not tab.

As written this should fault or hang if executed; the condition
buffer[i]!=' ' || buffer[i]!='\t' is true for every possible character
and will keep decrementing i, and also start_point, down past 0 (the
beginning of your buffer); that is already Undefined Behavior, and on
most platforms it will eventually reach nonexistent memory and trap,
or underflow the int which is also Undefined Behavior although there
are many fewer platforms where that traps; and if the range of int is
small enough (e.g. 16 bits) compared to the address space it might
just wrap around forever. However, since the condition is statically
determinable as always true, your compiler might have optimized it
away, as my gcc-2.95.3-8 did unless I make buffer volatile.
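
To make the point about the loop condition concrete, here is a minimal sketch that compares the posted test with the intended one. The function names are made up for this illustration and are not part of the original program.

```c
/* The test as posted: no character can equal both ' ' and '\t',
   so at least one of the two comparisons is always true. */
int keeps_scanning_posted(int c)
{
    return c != ' ' || c != '\t';
}

/* The intended test: keep scanning only while the character is
   neither a space nor a tab. */
int keeps_scanning_intended(int c)
{
    return c != ' ' && c != '\t';
}
```

With the first version the backward scan never sees a reason to stop; with the second it stops on the first space or tab it meets.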

However, in 99.99% of environments, treating tab like space for
computing word wrap will give you the wrong results. (Legal,
well-defined, deterministic, but wrong.) I hope you won't mind too
much my saying that you don't yet seem up to the complexity of
handling tab correctly, so for now you might better just ignore it.

Also you should decrement only i, and check for it reaching (or going
below) start_point -- that means there were no breakable points at
all within one wrap "span", and you have to decide what to do -- do
you allow the word(?) to violate the specified width, force a break in
the middle, or what?
buffer[++i]='\n';

You probably don't want to increment i here, with the loop condition
corrected the loop will exit with i pointing to the space (or tab);
and to be consistent you want to set start_point *after* the new \n.
}
start_point=i=0;
This is just a waste; setting local (auto) variables before you return
can never be useful.
return;
}

- David.Thompson1 at worldnet.att.net
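
Putting the points above together, one possible corrected version might look like the sketch below. It is only an illustration, not the original poster's final code: the name fold_fixed is invented here, tabs are ignored as suggested, a trailing newline from the input routine is assumed and left alone, and a word longer than ROW_LENGTH is simply allowed to overflow (one of the choices mentioned above).

```c
#define ROW_LENGTH 25 /* length before fold, as in the posted program */

/* Replace spaces with newlines so that each output line holds at most
   ROW_LENGTH characters wherever a break point exists.
   len is the string length; buffer[len] is the terminating '\0'. */
void fold_fixed(char buffer[], int len)
{
    int start = 0; /* index of the first character of the current line */

    /* Assumption: the input routine may have stored a trailing '\n';
       it does not count toward the width of the last line. */
    if (len > 0 && buffer[len - 1] == '\n')
        --len;

    while (len - start > ROW_LENGTH) {
        int i = start + ROW_LENGTH;

        /* Scan backwards over non-space characters, but never
           past the start of the current line. */
        while (i > start && buffer[i] != ' ')
            --i;

        if (i == start) {
            /* No break point in this span: let the long word overflow
               and break at the next space, if there is one. */
            i = start + ROW_LENGTH;
            while (i < len && buffer[i] != ' ')
                ++i;
            if (i >= len)
                break;
        }

        buffer[i] = '\n'; /* i indexes the space itself, so no ++i */
        start = i + 1;    /* the next line begins after the new newline */
    }
}
```

Only i moves during the backward scan; start_point's role (here start) changes once per line, after the newline has been written.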
 
Douglas G

Dave said:
I guess I should give the whole story. The problem is that input
is sent straight through the program with no alterations whatsoever.
void fold(char buffer[], int len)
{
int start_point,i;

start_point=i=0;
while (len > (start_point +ROW_LENGTH)) {
i=start_point+ROW_LENGTH;
while ( buffer[i]!=' ' || buffer[i]!='\t')
start_point=--i;


This is your problem. You want to search for a character that is
either a space or tab; to do that you want to skip every character
that is not space AND not tab.

As written this should fault or hang if executed; the condition
buffer[i]!=' ' || buffer[i]!='\t' is true for every possible character
and will keep decrementing i, and also start_point, down past 0 (the
beginning of your buffer); that is already Undefined Behavior, and on
most platforms it will eventually reach nonexistent memory and trap,
or underflow the int which is also Undefined Behavior although there
are many fewer platforms where that traps; and if the range of int is
small enough (e.g. 16 bits) compared to the address space it might
just wrap around forever. However, since the condition is statically
determinable as always true, your compiler might have optimized it
away, as my gcc-2.95.3-8 did unless I make buffer volatile.


working on this. I saw your point though.
However, in 99.99% of environments, treating tab like space for
computing word wrap will give you the wrong results. (Legal,
well-defined, deterministic, but wrong.) I hope you won't mind too
much my saying that you don't yet seem up to the complexity of
handling tab correctly, so for now you might better just ignore it.
I'm not quite sure how to handle it since the tabs are translated by the OS,
except in cases that I test where I pipe the contents of a file to the
program to avoid this translation. But you are right in that I'm not sure
how to handle those tabs in this instance.
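
For what it's worth, the usual reason a tab cannot be counted like a space is that it advances the display to the next tab stop rather than by one column. A rough sketch of that bookkeeping follows; the 8-column tab stops and the names TAB_WIDTH, next_column and display_width are assumptions made for this example, not anything from the posted program.

```c
#define TAB_WIDTH 8 /* assumed spacing of tab stops */

/* Return the display column reached after printing c at column col
   (columns are counted from 0). */
int next_column(int col, int c)
{
    if (c == '\t')
        return col + TAB_WIDTH - col % TAB_WIDTH; /* jump to next tab stop */
    if (c == '\n')
        return 0;                                 /* back to the left edge */
    return col + 1;
}

/* Display width of the first n characters of s, starting at column 0. */
int display_width(const char s[], int n)
{
    int col = 0, i;

    for (i = 0; i < n; ++i)
        col = next_column(col, s[i]);
    return col;
}
```

A wrap routine that wanted to handle tabs would compare this display column, rather than the array index, against the fold width.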
Also you should decrement only i, and check for it reaching (or going
below) start_point -- that means there were no breakable points at
all within one wrap "span", and you have to decide what to do -- do
you allow the word(?) to violate the specified width, force a break in
the middle, or what?

Hadn't thought of that one. Thanks.
buffer[++i]='\n';

You probably don't want to increment i here, with the loop condition
corrected the loop will exit with i pointing to the space (or tab);
and to be consistent you want to set start_point *after* the new \n.

correct me if I'm wrong and I probably am, but this statement is basically
the same as
buffer[i]='\n';
i=i+1;

if it were buffer[i++]='\n';

I would see your point on this.

Just what little reading I have done states that it evaluates right to left
in which i would increment after it was evaluated for the expression,
placing it one after the newline.
This is just a waste; setting local (auto) variables before you return
can never be useful.
Duh, can't believe I left it in.
 
Dave Thompson

Dave Thompson wrote:
buffer[++i]='\n';

You probably don't want to increment i here, with the loop condition
corrected the loop will exit with i pointing to the space (or tab);
and to be consistent you want to set start_point *after* the new \n.

correct me if I'm wrong and I probably am, but this statement is basically
the same as
buffer[i]='\n';
i=i+1;

You are wrong; see below.
if it were buffer[i++]='\n';

I would see your point on this.

Just what little reading I have done states that it evaluates right to left
in which i would increment after it was evaluated for the expression,
placing it one after the newline.

There is no general "right to left" rule and AFAIK it isn't even a
common practice. There are some things that are ordered -- the && ||
and comma operators specifically create what are called "sequence
points" which mean that one set of operations is completed before
another, as do complete statements (or declaration initializers). Most
other subexpressions of an expression can be evaluated basically in
any order the compiler finds convenient. Note that in a function call
like f(1,2,3) the commas are part of the syntax _not_ comma operators;
and order of evaluating multiple arguments is unspecified.

++x or --x is "pre{inc,dec}rement" -- it adds 1 to or subtracts 1 from
x but returns the value _before_ the change. x++ or x--
"post{inc,dec}rement" returns the value _after_ the change.

Thus b[i++] = x; is the same as b[i] = x; i = i + 1;
while b[++i] = x; is the same as i = i + 1; b[i] = x;
except if b[i] and x are actually in the same memory location, in
which case the 2-statement forms are well-defined but the "embedded"
ones aren't because they have multiple stores not separated by a
sequence point; but that isn't true for your case.

The only thing I can think of that comes close to your "rule" is that
C gives (all) postfix operators highest grammatical precedence and in
particular above prefix operators -- they "bind" most tightly, and as
a result are executed first. For example, given:
static unsigned int silly_data [10];
unsigned int * silly_func ( int x )
{ return silly_data + x; /* pointer into array */ }
then ! ++ silly_func (3) [4] first calls silly_func (3) to return a
pointer, subscripts that pointer with 4 to access silly_data[7],
increments that cell's contents, and then yields true (1) if the
result is zero and false (0) otherwise. Both right-hand operators are
thus evaluated before the left-hand ones, but not right to left which
would be impossible: how could you subscript the argument list (3) and
then use the result to call the function silly_func?

- David.Thompson1 at worldnet.att.net
 
Arthur J. O'Dwyer

++x or --x is "pre{inc,dec}rement" -- it adds 1 to or subtracts 1 from
x but returns the value _before_ the change. x++ or x--
"post{inc,dec}rement" returns the value _after_ the change.

Backwards explanation...
Thus b[i++] = x; is the same as b[i] = x; i = i + 1;
while b[++i] = x; is the same as i = i + 1; b[i] = x;


...right examples. Pre-increment returns the value /after/ the
increment (the increment happens /pre/ the evaluation), and vice
versa for post-increment.
except if b[i] and x are actually in the same memory location,


Again wrong. The UB happens if 'x' is an expression involving the
value of 'i' in some way. The snippet

int b[10], *x = &b[5], i = 5;
b[i] = 42;
b[i++] = *x;

is perfectly well-defined; it stores 42 in 'b[5]', and then stores it
there again, incrementing 'i' to 6 in the process. The snippet

b[i++] = i;

on the other hand, is undefined.

-Arthur
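
A small sketch of the two well-defined forms may help anyone reading along. The helper names store_post and store_pre are invented for this example; each returns the new value of i so the effect on the index is visible.

```c
/* Post-increment: the element at the OLD index is written,
   then the index grows by one. */
int store_post(int b[], int i, int v)
{
    b[i++] = v;   /* same effect as: b[i] = v; i = i + 1; */
    return i;
}

/* Pre-increment: the index grows first, so the element at the
   NEW index is written. */
int store_pre(int b[], int i, int v)
{
    b[++i] = v;   /* same effect as: i = i + 1; b[i] = v; */
    return i;
}
```

Starting from i == 1, store_post writes b[1] and store_pre writes b[2]; both leave the index at 2. That is why buffer[++i]='\n' in the posted fold would overwrite the character after the space rather than the space itself.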
 
Dave Thompson

[ pre versus post inc/dec ]
Backwards explanation... <snip> ...right examples. <snip>
except if b[i] and x are actually in the same memory location,


Again wrong. The UB happens if 'x' is an expression involving the
value of 'i' in some way. <snip>


Gack! Sorry, I must have been sick that day or something.

- David.Thompson1 at worldnet.att.net
 
