simple word wrap problem not wrapping

Discussion in 'C Programming' started by Douglas G, Aug 31, 2004.

  1. Douglas G

    Douglas G Guest

    I've tried various ideas on this problem, but I don't see word wrapping.

    Can you point out what is wrong? It's a K&R exercise, and I'm still new to
    programming. Other pointers would be helpful too.

    #include "header.h"
    /* does the wordwrapping */

    void fold(char buffer[], int len)
    {
    int start_point,i;

    start_point=i=0;
    while (len > (start_point +COLUMN)) {
    i=start_point+COLUMN;
    while ( buffer[i]!=' ' || buffer[i]!='\t')
    start_point=--i;
    buffer[++i]='\n';
    }
    start_point=i=0;

    return;
    }

    [snippet from header.h]

    /* header.h */
    #include <stdio.h>
    #define MAXLINE 1000 /* maximum input line size */
    #define COLUMN 35 /* length before fold */
     
    Douglas G, Aug 31, 2004
    #1

  2. Douglas G

    Malcolm Guest

    "Douglas G" <> wrote
    >
    > I've tried various ideas on this problem, but I don't see word wrapping.
    >
    > Can you point out what is wrong? It's a K&R exercise, and I'm still new to
    > programming. Other pointers would be helpful too.
    >
    > #include "header.h"
    > /* does the wordwrapping */
    >

    This comment is better than noting, but still way too terse. Describe what
    the function is meant to do. Does it take in a string of arbitrary length
    and replace spaces by newlines to achieve wrapping? If so what does it do if
    passed a degenerate unwrappable string?
    Why are you passing the length of the string? Does this mean the string need
    not be NUL-terminated, or is it for efficiency? You need to explain.
    >
    > void fold(char buffer[], int len)
    > {
    > int start_point,i;
    >
    > start_point=i=0;
    >
    > while (len > (start_point +COLUMN)) {
    > i=start_point+COLUMN;
    > while ( buffer[i]!=' ' || buffer[i]!='\t')
    > start_point=--i;
    > buffer[++i]='\n';
    > }
    > start_point=i=0;
    >

    The logic looks highly dodgy to me. Remember you can be passed any string.
    >
    >
    > return;
    > }
    >
    > [snippet from header.h]
    >
    > /* header.h */
    > #include <stdio.h>
    > #define MAXLINE 1000 /* maximum input line size */
    > #define COLUMN 35 /* length before fold */
    >
    >
    >
     
    Malcolm, Aug 31, 2004
    #2

  3. Douglas G

    Douglas G Guest

    Malcolm wrote:

    >
    > "Douglas G" <> wrote
    >>
    >> I've tried various ideas on this problem, but I don't see word wrapping.
    >>
    >> Can you point out what is wrong? It's a K&R exercise, and I'm still new to
    >> programming. Other pointers would be helpful too.
    >>
    >> #include "header.h"
    >> /* does the wordwrapping */
    >>

    > This comment is better than noting, but still way too terse. Describe what
    > the function is meant to do. Does it take in a string of arbitrary length
    > and replace spaces by newlines to achieve wrapping? If so what does it do
    > if passed a degenerate unwrappable string?
    > Why are you passing the length of the string? Does this mean the string
    > need not be NUL-terminated, or is it for efficiency? You need to explain.


    Basic word wrap in that at a certain column length it checks if there is
    white space and replaces it with a newline. Otherwise it counts back to
    the first white space and replaces it. Then the counter should be set to
    the next spot. So if the line is actually longer than several column
    lengths it will catch.

    Part of the exercise. The routine that catches the input places a null
    terminator at the end as well as passes the length back. Saves me time
    trying to find the length again.
    >>
    >> void fold(char buffer[], int len)
    >> {
    >> int start_point,i;
    >>
    >> start_point=i=0;
    >>
    >> while (len > (start_point +COLUMN)) {
    >> i=start_point+COLUMN;
    >> while ( buffer[i]!=' ' || buffer[i]!='\t')
    >> start_point=--i;
    >> buffer[++i]='\n';
    >> }
    >> start_point=i=0;
    >>

    > The logic looks highly dodgy to me. Remember you can be passed any
    > string.
    >>
    >>

    Any suggestions on it? I'm new to programming. I thought it was concise
    enough. It still fails the final test, which is, "does it work?"
    >> return;
    >> }
    >>
    >> [snippet from header.h]
    >>
    >> /* header.h */
    >> #include <stdio.h>
    >> #define MAXLINE 1000 /* maximum input line size */
    >> #define COLUMN 35 /* length before fold */
    >>
    >>
    >>
     
    Douglas G, Aug 31, 2004
    #3
  4. On Tue, 31 Aug 2004, Malcolm wrote:
    >
    > "Douglas G" <> wrote
    >> I've tried various ideas on this problem, but I don't see word wrapping.
    >> Can you point out what is wrong?

    [...]
    >> /* does the wordwrapping */
    >>

    > This comment is better than not[h]ing, but still way too terse.


    By some people's standards, it's worse than nothing---it's a blatant
    lie! After all, the whole point of the OP's post was that this function
    does /not/ do word wrapping. At least, not so he could "see" it. :) A
    better comment would have been

    /* Tries to do wordwrapping */

    followed by an explanation of why and how it actually /doesn't/ succeed.
    If the OP had tried to write down a description of the bug, he probably
    would have discovered (some of) the problems on his own, with no outside
    help required!

    HTH,
    -Arthur
     
    Arthur J. O'Dwyer, Aug 31, 2004
    #4
  5. Douglas G

    Joe Wright Guest

    Douglas G wrote:

    Snip All

    Allow me please to re-state the problem. I think what you want to do
    is to re-format a text file and place the '\n' in a more convenient
    place. Right?

    There is no need to do this line by line. There is no need for
    strings of any sort. Consider a short routine I wrote yesterday.

    /*
    Cursory examination of Word .doc files shows the text starting
    at 0x600 bytes into the file and ending with '\0'. Line ending
    is a single 0x0D character. Apple? :=)
    Should be easy enough, right?
    */

    #include <stdio.h>

    int main(int argc, char *argv[]) {
    FILE *in;
    int c, w = 0, sp = 0;
    in = fopen(argv[1], "rb");
    if (in != NULL) {
    fseek(in, 0x600, SEEK_SET);
    while ((c = fgetc(in)) != '\0' && c != EOF) {
    if (++w > 60) {
    if (c == ' ')
    sp = 1;
    if (c != ' ' && c != '\r' && sp == 1) {
    ungetc(c, in);
    c = '\r';
    }
    }
    if (c == '\r')
    putchar('\n'), w = sp = 0;
    else
    putchar(c);
    }
    fclose(in);
    }
    return 0;
    }

    Not a string in sight. And it wraps.
    --
    Joe Wright mailto:
    "Everything should be made as simple as possible, but not simpler."
    --- Albert Einstein ---
     
    Joe Wright, Sep 1, 2004
    #5
  6. Douglas G

    Douglas G Guest

    Joe Wright wrote:

    > Douglas G wrote:
    >
    > Snip All
    >
    > Allow me please to re-state the problem. I think what you want to do
    > is to re-format a text file and place the '\n' in a more convenient
    > place. Right?
    >
    > There is no need to do this line by line. There is no need for
    > strings of any sort. Consider a short routine I wrote yesterday.
    >

    I guess I should give the whole story. The problem is that input is
    sent straight through the program with no alterations whatsoever.

    The input routine collects it, and ends the input with \n and then \0 and
    returns the length of the string.

    The troublesome routine doesn't do any word wrapping at all, as I have
    adjusted the size in the header file to the ridiculous in order to try and
    see any effects.

    The routine is assuming that the length could be anywhere up to the maximum
    size of the input 1000 characters. Which means it would need word wrapping
    more than once until it reaches the end.

    My intended approach was to take a starting point plus the row length and
    start checking backwards for the first available whitespace and then
    replace it with a newline. Then change the starting point to the current
    position and then iterate through the loop until the next starting point
    plus the row length exceeded the length of the string.

    However, the output looks like the program never ran, because it doesn't
    wrap, doesn't complain, and doesn't segfault. I've tried it as a single-file
    program to make sure nothing was lost by splitting things up. No changes
    whatsoever. I've added the -pedantic -Wall switches and made a few
    changes. So here is the complete program, warts and all. Other suggestions
    are welcome, since I don't know any programmers and would
    welcome anything that helps me program with better habits.

    #include <stdio.h>
    #define MAXLINE 1000 /* maximum input line size */
    #define ROW_LENGTH 25 /* length before fold */


    int getline(char input_line[], int length_of_input);
    void putline(char buffer[], int line_length);
    void fold (char s[], int len);


    /* does a word wrap at designated spots using ROW_LENGTH. */

    int main()
    {
    int i, len, start_point;
    char line[MAXLINE];

    len=start_point=i=0;
    while ((len = getline( line, MAXLINE)) > 0) {

    /* start of word wrap */
    fold(line, len);
    putline(line, len);
    }
    return 0;
    }
    /* does the wordwrapping using ROW_LENGTH as the start point
    and starts back until it finds a white space and changes
    it to a new line and iterates through this until it exceeds
    the length passed to it */

    void fold(char buffer[], int len)
    {
    int start_point,i;

    start_point=i=0;
    while (len > (start_point +ROW_LENGTH)) {
    i=start_point+ROW_LENGTH;
    while ( buffer[i]!=' ' || buffer[i]!='\t')
    start_point=--i;
    buffer[++i]='\n';
    }
    start_point=i=0;

    return;
    }
    /* putline: displays the line */

    void putline(char buffer[], int lim)
    {
    int i;
    for (i=0; i <lim; ++i)
    putchar(buffer[i]);
    return;
    }
    /* getline: read a line into s return length */

    int getline(char s[], int lim)
    {
    int c,i;
    for (i=0; i <MAXLINE-1 && (c=getchar())!=EOF && c!='\n'; ++i)
    s[i]=c;
    if (c=='\n') {
    s[i]=c;
    ++i;
    }
    s[i]='\0';
    return i;
    }
     
    Douglas G, Sep 1, 2004
    #6
  7. On Wed, 01 Sep 2004 02:33:34 GMT, Douglas G <>
    wrote:
    <snip>
    > I guess I should give the whole story. The problem is that input is
    > sent straight through the program with no alterations whatsoever.

    <snip>
    > void fold(char buffer[], int len)
    > {
    > int start_point,i;
    >
    > start_point=i=0;
    > while (len > (start_point +ROW_LENGTH)) {
    > i=start_point+ROW_LENGTH;
    > while ( buffer[i]!=' ' || buffer[i]!='\t')
    > start_point=--i;


    This is your problem. You want to search for a character that is
    either a space or tab; to do that you want to skip every character
    that is not space AND not tab.

    As written this should fault or hang if executed; the condition
    buffer[i]!=' ' || buffer[i]!='\t' is true for every possible character
    and will keep decrementing i, and also start_point, down past 0 (the
    beginning of your buffer); that is already Undefined Behavior, and on
    most platforms it will eventually reach nonexistent memory and trap,
    or underflow the int which is also Undefined Behavior although there
    are many fewer platforms where that traps; and if the range of int is
    small enough (e.g. 16 bits) compared to the address space it might
    just wrap around forever. However, since the condition is statically
    determinable as always true, your compiler might have optimized it
    away, as my gcc-2.95.3-8 did unless I make buffer volatile.
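
To make the "true for every possible character" point concrete, here is a small sketch that counts how many character values would actually stop the backward scan under each form of the test. The helper names (keep_scanning_or, keep_scanning_and, count_stoppers) are made up for this illustration and are not part of the original program.

```c
#include <limits.h>

/* Hypothetical helpers for illustration only. */

/* The test as posted: the scan continues while this is true. */
int keep_scanning_or(int c)
{
    return c != ' ' || c != '\t';
}

/* The intended test: continue while c is neither a space nor a tab. */
int keep_scanning_and(int c)
{
    return c != ' ' && c != '\t';
}

/* Counts how many unsigned char values make the given test false,
   i.e. how many characters would end the backward scan. */
int count_stoppers(int (*keep_scanning)(int))
{
    int c, n = 0;

    for (c = 0; c <= UCHAR_MAX; ++c)
        if (!keep_scanning(c))
            ++n;
    return n;
}
```

With the || form no character ever ends the scan, because any given character differs from at least one of space and tab. With the && form exactly two characters end it.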

    However, in 99.99% of environments, treating tab like space for
    computing word wrap will give you the wrong results. (Legal,
    well-defined, deterministic, but wrong.) I hope you won't mind too
    much my saying that you don't yet seem up to the complexity of
    handling tab correctly, so for now you might better just ignore it.
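
As a rough illustration of why a tab cannot be counted as one column: a tab advances the output to the next tab stop, so its width depends on where it appears. The sketch below assumes tab stops at every multiple of tabstop (8 is common but not guaranteed) and one column per other character. The function name display_width is an assumption made for this example.

```c
/* Returns the column reached after printing s, starting from column 0.
   Assumption: a tab moves to the next multiple of tabstop, and every
   other character occupies exactly one column. */
int display_width(const char *s, int tabstop)
{
    int col = 0;

    for (; *s != '\0'; ++s) {
        if (*s == '\t')
            col += tabstop - col % tabstop;   /* jump to next tab stop */
        else
            ++col;
    }
    return col;
}
```

For example, with tabstop 8 the three-character string "a\tb" ends at column 9, not column 3, which is why a wrap computed from the character count alone lands in the wrong place.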

    Also you should decrement only i, and check for it reaching (or going
    below) start_point -- that means there were no breakable points at
    all within one wrap "span", and you have to decide what to do -- do
    you allow the word(?) to violate the specified width, force a break in
    the middle, or what?

    > buffer[++i]='\n';


    You probably don't want to increment i here, with the loop condition
    corrected the loop will exit with i pointing to the space (or tab);
    and to be consistent you want to set start_point *after* the new \n.

    > }
    > start_point=i=0;
    >

    This is just a waste; setting local (auto) variables before you return
    can never be useful.

    > return;
    > }
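
Putting the points above together, one possible corrected version might look like the sketch below. It is not the original poster's final code. The name fold_sketch and the width parameter (standing in for ROW_LENGTH) are assumptions for this example, tabs are ignored as suggested, and the policy for a word longer than the width (let it overflow and break at the next space) is only one of the choices mentioned.

```c
/* Replaces spaces in buffer[0..len-1] with newlines so that rows are
   at most width characters long where possible. A trailing '\n', as
   stored by the input routine, is not counted as part of the text. */
void fold_sketch(char buffer[], int len, int width)
{
    int start = 0;      /* index of the first character of the current row */
    int end = len;      /* one past the last character to be wrapped */
    int i;

    if (end > 0 && buffer[end - 1] == '\n')
        --end;

    while (end - start > width) {
        /* Scan back from the first position past the row for a space. */
        i = start + width;
        while (i > start && buffer[i] != ' ')
            --i;

        if (i == start) {
            /* No usable space in this span: let the word overflow
               and break at the next space, if there is one. */
            i = start + width;
            while (i < end && buffer[i] != ' ')
                ++i;
            if (i >= end)
                break;
        }

        buffer[i] = '\n';   /* i indexes the space; no increment needed */
        start = i + 1;      /* next row begins after the new newline */
    }
}
```

Only i moves during the backward scan, and start_point (here start) is updated once per row, after the newline has been written.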


    - David.Thompson1 at worldnet.att.net
     
    Dave Thompson, Sep 8, 2004
    #7
  8. Douglas G

    Douglas G Guest

    Dave Thompson wrote:

    > On Wed, 01 Sep 2004 02:33:34 GMT, Douglas G <>
    > wrote:
    > <snip>
    >> I guess I should give the whole story. The problem is that input
    >> is sent straight through the program with no alterations whatsoever.

    > <snip>
    >> void fold(char buffer[], int len)
    >> {
    >> int start_point,i;
    >>
    >> start_point=i=0;
    >> while (len > (start_point +ROW_LENGTH)) {
    >> i=start_point+ROW_LENGTH;
    >> while ( buffer[i]!=' ' || buffer[i]!='\t')
    >> start_point=--i;

    >
    > This is your problem. You want to search for a character that is
    > either a space or tab; to do that you want to skip every character
    > that is not space AND not tab.
    >
    > As written this should fault or hang if executed; the condition
    > buffer[i]!=' ' || buffer[i]!='\t' is true for every possible character
    > and will keep decrementing i, and also start_point, down past 0 (the
    > beginning of your buffer); that is already Undefined Behavior, and on
    > most platforms it will eventually reach nonexistent memory and trap,
    > or underflow the int which is also Undefined Behavior although there
    > are many fewer platforms where that traps; and if the range of int is
    > small enough (e.g. 16 bits) compared to the address space it might
    > just wrap around forever. However, since the condition is statically
    > determinable as always true, your compiler might have optimized it
    > away, as my gcc-2.95.3-8 did unless I make buffer volatile.
    >


    working on this. I saw your point though.

    > However, in 99.99% of environments, treating tab like space for
    > computing word wrap will give you the wrong results. (Legal,
    > well-defined, deterministic, but wrong.) I hope you won't mind too
    > much my saying that you don't yet seem up to the complexity of
    > handling tab correctly, so for now you might better just ignore it.
    >

    I'm not quite sure how to handle it since the tabs are translated by the OS,
    except in cases that I test where I pipe the contents of a file to the
    program to avoid this translation. But you are right in that I'm not sure
    how to handle those tabs in this instance.

    > Also you should decrement only i, and check for it reaching (or going
    > below) start_point -- that means there were no breakable points at
    > all within one wrap "span", and you have to decide what to do -- do
    > you allow the word(?) to violate the specified width, force a break in
    > the middle, or what?


    Hadn't thought of that one. Thanks.
    >
    >> buffer[++i]='\n';

    >
    > You probably don't want to increment i here, with the loop condition
    > corrected the loop will exit with i pointing to the space (or tab);
    > and to be consistent you want to set start_point *after* the new \n.


    correct me if I'm wrong and I probably am, but this statement is basically
    the same as
    buffer[i]='\n';
    i=i+1;

    if it were buffer[i++]='\n';

    I would see your point on this.

    Just what little reading I have done states that it evaluates right to left
    in which i would increment after it was evaluated for the expression,
    placing it one after the newline.
    >
    >> }
    >> start_point=i=0;
    >>

    > This is just a waste; setting local (auto) variables before you return
    > can never be useful.
    >

    Duh, can't believe I left it in.
     
    Douglas G, Sep 8, 2004
    #8
  9. On Wed, 08 Sep 2004 22:03:36 GMT, Douglas G <>
    wrote:

    > Dave Thompson wrote:

    <snip>
    > >> buffer[++i]='\n';

    > >
    > > You probably don't want to increment i here, with the loop condition
    > > corrected the loop will exit with i pointing to the space (or tab);
    > > and to be consistent you want to set start_point *after* the new \n.

    >
    > correct me if I'm wrong and I probably am, but this statement is basically
    > the same as
    > buffer[i]='\n';
    > i=i+1;
    >

    You are wrong; see below.

    > if it were buffer[i++]='\n';
    >
    > I would see your point on this.
    >
    > Just what little reading I have done states that it evaluates right to left
    > in which i would increment after it was evaluated for the expression,
    > placing it one after the newline.


    There is no general "right to left" rule and AFAIK it isn't even a
    common practice. There are some things that are ordered -- the && ||
    and comma operators specifically create what are called "sequence
    points" which mean that one set of operations is completed before
    another, as do complete statements (or declaration initializers). Most
    other subexpressions of an expression can be evaluated basically in
    any order the compiler finds convenient. Note that in a function call
    like f(1,2,3) the commas are part of the syntax _not_ comma operators;
    and order of evaluating multiple arguments is unspecified.
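
A short sketch of the two ordering guarantees named above, using made-up function names (starts_with, bump_then_double) purely for illustration:

```c
#include <stddef.h>

/* Returns 1 if s is non-null and begins with c, 0 otherwise.
   && evaluates its left operand completely first and skips the right
   operand when the left one is false, so s is never dereferenced
   when it is NULL. */
int starts_with(const char *s, char c)
{
    return s != NULL && *s == c;
}

/* The comma operator also sequences its operands: *n is incremented
   first, and only then is the new value read and doubled. */
int bump_then_double(int *n)
{
    return (++*n, *n * 2);
}
```

By contrast, in a call such as f(g(), h()) the commas are argument separators, so nothing guarantees whether g or h runs first.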

    ++x or --x is "pre{inc,dec}rement" -- it adds 1 to or subtracts 1 from
    x but returns the value _before_ the change. x++ or x--
    "post{inc,dec}rement" returns the value _after_ the change.

    Thus b[i++] = x; is the same as b[i] = x; i = i + 1;
    while b[++i] = x; is the same as i = i + 1; b[i] = x;
    except if b and x are actually in the same memory location, in
    which case the 2-statement forms are well-defined but the "embedded"
    ones aren't because they have multiple stores not separated by a
    sequence point; but that isn't true for your case.

    The only thing I can think of that comes close to your "rule" is that
    C gives (all) postfix operators highest grammatical precedence and in
    particular above prefix operators -- they "bind" most tightly, and as
    a result are executed first. For example, given:
    static unsigned int silly_data [10];
    unsigned int * silly_func ( int x )
    { return silly_data + x; /* pointer into array */ }
    then ! ++ silly_func (3) [4] first calls silly_func (3) to return a
    pointer, subscripts that pointer with 4 to access silly_data[7],
    increments that cell's contents, and then yields true (1) if the
    result is zero and false (0) otherwise. Both right-hand operators are
    thus evaluated before the left-hand ones, but not right to left which
    would be impossible: how could you subscript the argument list (3) and
    then use the result to call the function silly_func?
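
For anyone who wants to try the example, here it is as a compilable sketch. The wrapper silly_demo is an addition for this illustration, and static is dropped from silly_data only so the array can be inspected from other code; it is still zero-initialized.

```c
unsigned int silly_data[10];    /* file scope, so all elements start at 0 */

unsigned int *silly_func(int x)
{
    return silly_data + x;      /* pointer into array */
}

/* Hypothetical wrapper around the expression under discussion. */
int silly_demo(void)
{
    /* Parsed as !(++((silly_func(3))[4])): call, subscript,
       increment, then logical negation. */
    return ! ++ silly_func (3) [4];
}
```

On the first call silly_data[7] goes from 0 to 1, so the incremented value is nonzero and the expression yields 0.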

    - David.Thompson1 at worldnet.att.net
     
    Dave Thompson, Sep 13, 2004
    #9
  10. On Mon, 13 Sep 2004, Dave Thompson wrote:
    >
    > ++x or --x is "pre{inc,dec}rement" -- it adds 1 to or subtracts 1 from
    > x but returns the value _before_ the change. x++ or x--
    > "post{inc,dec}rement" returns the value _after_ the change.


    Backwards explanation...

    > Thus b[i++] = x; is the same as b[i] = x; i = i + 1;
    > while b[++i] = x; is the same as i = i + 1; b[i] = x;


    ...right examples. Pre-increment returns the value /after/ the
    increment (the increment happens /pre/ the evaluation), and vice
    versa for post-increment.
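
A minimal sketch of the difference, with hypothetical helper names (store_post, store_pre); each returns the final value of i so the effect on the index is visible as well:

```c
/* Post-increment: the subscript uses the OLD value of i,
   and i is advanced as a side effect. */
int store_post(int b[], int i, int x)
{
    b[i++] = x;
    return i;
}

/* Pre-increment: i is advanced first, so the subscript
   uses the NEW value of i. */
int store_pre(int b[], int i, int x)
{
    b[++i] = x;
    return i;
}
```

Starting from i equal to 1, store_post writes to b[1] and store_pre writes to b[2]; both return 2.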

    > except if b and x are actually in the same memory location,


    Again wrong. The UB happens if 'x' is an expression involving the
    value of 'i' in some way. The snippet

    int b[10], *x = &b[5], i = 5;
    b[i] = 42;
    b[i++] = *x;

    is perfectly well-defined; it stores 42 in 'b[5]', and then stores it
    there again, incrementing 'i' to 6 in the process. The snippet

    b[i++] = i;

    on the other hand, is undefined.
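
The well-defined snippet can be wrapped in a function and run, as sketched below, together with one way to write what b[i++] = i might have been meant to do without undefined behavior. The names alias_demo and store_index are assumptions for this example, and store_index assumes the intent was to store the old value of i.

```c
/* x points at b[5], so the assignment reads and writes the same
   object, but i is modified once and not read anywhere else in the
   expression. Returns the final value of i. */
int alias_demo(int b[10])
{
    int *x = &b[5], i = 5;

    b[i] = 42;
    b[i++] = *x;
    return i;
}

/* A well-defined alternative to b[i++] = i, assuming the old value
   of i is wanted: do the store and the increment in separate steps.
   Returns the incremented index. */
int store_index(int b[], int i)
{
    b[i] = i;
    return i + 1;
}
```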

    -Arthur
     
    Arthur J. O'Dwyer, Sep 13, 2004
    #10
  11. On Mon, 13 Sep 2004 00:15:28 -0400 (EDT), "Arthur J. O'Dwyer"
    <> wrote:

    >
    > On Mon, 13 Sep 2004, Dave Thompson wrote:
    > > [ pre versus post inc/dec ]

    > Backwards explanation... <snip> ...right examples. <snip>


    > > except if b and x are actually in the same memory location,

    >
    > Again wrong. The UB happens if 'x' is an expression involving the
    > value of 'i' in some way. <snip>


    Gack! Sorry, I must have been sick that day or something.

    - David.Thompson1 at worldnet.att.net
     
    Dave Thompson, Sep 20, 2004
    #11