Remove extra blanks


lovecreatesbea...

Thad said:
No, initializations of auto variables are performed on encountering the
declarations in the control flow. I don't have H&S to refer to. Maybe
they meant that an initialization of static variables would only occur
once, not each invocation.

Thad, thanks for the clarification of the usage of the initialization
of auto variables.
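
To make the distinction concrete, here is a minimal sketch (the name calls_seen is made up for illustration and does not come from the thread). The auto variable is initialized every time control reaches its declaration, while the static one is initialized only once:

```c
#include <assert.h>

/* Hypothetical demo function, not from the thread. */
static int calls_seen(void)
{
    int fresh = 0;        /* auto: initialized on every call */
    static int kept = 0;  /* static: initialized once, keeps its value */

    fresh++;
    kept++;
    assert(fresh == 1);   /* the auto variable always starts over */
    return kept;          /* grows by one with each call */
}
```

Three calls in a row return 1, 2 and 3, and the assert never fires.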
I wouldn't do that, but would instead clearly document the parameter
requirements and return value. I would also add more to make the
program a little easier to read and use my own brace rules:

while ((*d++ = *s) != '\0') {
while (*s++ == ' ' && *s == ' ') n++;
}
return n;
}

The explicit comparison with '\0' makes the intent clearer and
eliminates a warning from some compilers. The unneeded braces follow my
own rule: use braces whenever the statement controlled by a for, while,
if, or else is on a following line (otherwise I tend to forget that
there is no brace and add another indented line below the original --
oops!).
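
For reference, the loop above could be wrapped up roughly as follows. This is only a sketch: the name squeeze_blanks, the size_t counter and the documented requirements are assumptions, since the original function header was not quoted.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy s to d, collapsing each run of blanks to a single blank.
 * Requirements (assumed for this sketch): s is a valid string, d has
 * room for strlen(s) + 1 bytes, and the two arrays do not overlap.
 * Returns the number of blanks removed.
 */
static size_t squeeze_blanks(char *d, const char *s)
{
    size_t n = 0;

    assert(d != NULL && s != NULL);
    while ((*d++ = *s) != '\0') {
        /* s always advances once; it advances further over extra blanks */
        while (*s++ == ' ' && *s == ' ') n++;
    }
    return n;
}
```

Note that this version keeps one leading blank if the input starts with blanks.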

Regards
 

CBFalconer

Registered said:
Richard said:
Joe Wright said:
CBFalconer wrote:
Registered User wrote:
I'm trying to write a program that replaces two or more consecutive
blanks in a string by a single blank.

Here's what I did:

#include <stdio.h>
#include <string.h>
#define MAX 80

int main()
{
char s[MAX];
int i, j;
fgets(s, MAX, stdin);
i=strlen(s);
while(i)
{
while (s[--i]!=' ' && i>0) /*Find the last space*/
;
j=i;
while (s[--j]==' ' && j>0) /*Go to the last non-space*/
; /*char before s[i]*/

if (s[j]!=' ')
j++; /*Increment j so that s[j] is a space*/

if (j<i) /*If extra spaces have been found, remove*/
while (s[i]) /*them by left shifting */
s[++j]=s[++i]; /*the characters on the right*/

i=j;
}
puts(s);

return 0;
}

The program works fine, but I have a feeling that I've made it
unnecessarily complicated.

Can anyone suggest ways on which I can improve upon the code? Is there
a better algorithm?
Try this. Notice the absence of string buffers. Should work until
you get over 32767 consecutive blanks.

#include <stdio.h>
int main(void)
{
int blanks, ch;
blanks = 0;
while (EOF != (ch = getchar())) {
if (' ' == ch) {
++blanks;
if (1 == blanks) putchar(' ');
}
else {
putchar(ch);
blanks = 0;
}
}
return 0;
}

Or maybe..

#include <stdio.h>
int main(void) {
int ch, last = 0;
while ((ch = getchar()) != EOF) {
if (!(ch == ' ' && last == ch))
putchar(ch);
last = ch;
}
return 0;
}


The problem is that I think the OP was only using fgets to get a
sample string - it is therefore not the exercise to simplify the
algorithm using getchar() or whatever.

His spec was to remove excess spaces from a string.


Yes, that's what I want to do: remove excess spaces from a given
string.


Oh very well. Try this:

#include <stdio.h>

/* ----------------- */

static void deblank(char *d, const char *s)
{
char last;

last = ' '; /* to elide all leading blanks, else use '\0' */
while (*s) {
if ((' ' != *s) || (' ' != last)) *d++ = *s;
last = *s++;
}
*d = '\0';
}

/* ----------------- */

int main(void)
{
char s[] = " A string with excess blanks.";
char d[sizeof s];

puts(s);
deblank(d, s);
puts(d);
return 0;
}
 

Andrew Poelstra


I think that a useful extension would be to have deblank() return the
number of characters stripped (to help the caller reallocate if
necessary).

I'm not sure who that's best left as an exercise to.
 

CBFalconer

Andrew said:

I think that a useful extension would be to have deblank() return
the number of characters stripped (to help the caller reallocate
if necessary).

In the interest of vertical economy and obfuscation, try:

/* Remove multiple blanks, return strlen(d) */
static size_t deblank(char *d, const char *s)
{
char last, *p;

for (p = d, last = ' '; (*d = *s); last = *s++) {
if ((' ' != *s) || (' ' != last)) *d++;
}
return d - p;
}

This cat has been sufficiently skinned.
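
As a rough illustration of how a caller might use the returned length to resize its buffer: deblank_dup is a hypothetical helper, and writing d++ in place of *d++ is a cosmetic change to the routine above (the dereferenced value was never used).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* The routine from the post above, with d++ in place of *d++. */
static size_t deblank(char *d, const char *s)
{
    char last, *p;

    for (p = d, last = ' '; (*d = *s) != '\0'; last = *s++) {
        if ((' ' != *s) || (' ' != last)) d++;
    }
    return (size_t)(d - p);
}

/* Hypothetical helper: returns a malloc'd, right-sized deblanked copy
   of s, or NULL if the first allocation fails. The caller frees it. */
static char *deblank_dup(const char *s)
{
    char *buf, *fit;
    size_t len;

    assert(s != NULL);
    buf = malloc(strlen(s) + 1);     /* worst case: nothing is removed */
    if (buf == NULL) return NULL;
    len = deblank(buf, s);
    fit = realloc(buf, len + 1);     /* shrink to what was actually used */
    return fit != NULL ? fit : buf;  /* if realloc fails, buf is still valid */
}
```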
 

Rod Pemberton

Registered User said:
Richard said:
Richard said:

Much faster & efficient IMO (in most cases I would think) to
malloc a new string, copy the original one into it and then update the
original in place - no repeated shuffling.

strcpy(refCopy,refStr);
char *d=refStr; /*destination*/
char *s=refCopy; /*original string copy - source*/
char ch;
while(*d++=(ch=*s++))
if(ch==' '){
while((ch=*s++)&&(ch==' ')); /*gobble up following spaces*/
if (!(*d++=ch)) /* store first non space */
break;
}

Not tested, but you will get the idea.


Thanks for the idea, Richard. I've implemented something similar:

#include <stdio.h>
#define MAX 80
int main()
{
char s[MAX];
int i, j;
fgets(s, MAX, stdin);
i=j=0;
while(s[i++]=s[j++])
{
if (s[i-1]==' ') /*If the last character copied was a space*/
{
while (s[j++]==' ') /*gobble up following spaces */
;
if ((s[i++]=s[j-1])=='\0')
break;
}
}
puts(s);
return 0;
}
Idiocy alert: you don't even need the copy, thus saving fannying around
with mallocs etc. Just set s to be d. This is fine since s is always
the same as or, after the first double space, ahead of the destination pointer.

Whoops.

Yeah, that was funny.


A variant for your perusal:

#include <stdio.h>
#define MAX 80

int main()
{
char s[MAX];
int i, j, k;
fgets(s, MAX, stdin);

for(i=0,j=0,k=0;s[j];i++,j++,k=j)
{
while(s[j]==' ')
j++;
if(k!=j)
j--;
s[i]=s[j];
}
s[i]='\0';

printf("|%s|\n",s);
return 0;
}


Rod Pemberton
 

Richard Heathfield

Rod Pemberton said:

A variant for your perusal:

#include <stdio.h>
#define MAX 80

int main()
{
char s[MAX];
int i, j, k;
fgets(s, MAX, stdin);

for(i=0,j=0,k=0;s[j];

Undefined behaviour if fgets returned NULL.
 

CBFalconer

Richard said:
Rod Pemberton said:

A variant for your perusal:

#include <stdio.h>
#define MAX 80

int main()
{
char s[MAX];
int i, j, k;
fgets(s, MAX, stdin);

for(i=0,j=0,k=0;s[j];

Undefined behaviour if fgets returned NULL.

I repeat my earlier suggestion. In the interest of vertical
economy and obfuscation, try:

/* Remove multiple blanks, return strlen(d) */
static size_t deblank(char *d, const char *s)
{
char last, *p;

for (p = d, last = ' '; (*d = *s); last = *s++)
if ((' ' != *s) || (' ' != last)) *d++;
return d - p;
}
 

Richard Heathfield

CBFalconer said:

/* Remove multiple blanks, return strlen(d) */
static size_t deblank(char *d, const char *s)
{
char last, *p;

for (p = d, last = ' '; (*d = *s); last = *s++)
if ((' ' != *s) || (' ' != last)) *d++;
return d - p;
}

I recommend accepting size_t len, the maximum number of bytes that can
legitimately be written to the memory whose start is pointed to by d.
 

CBFalconer

Richard said:
CBFalconer said:



I recommend accepting size_t len, the maximum number of bytes
that can legitimately be written to the memory whose start is
pointed to by d.

Huh? I do NOT understand the comment. It returns size_t.
 

Guest

CBFalconer said:
Huh? I do NOT understand the comment. It returns size_t.

I think it's intended to avoid buffer overflows (fgets() vs. gets()).
You may not really need it, though, since no more than strlen(s)+1
characters can ever be written, and that much memory should be
available when the function is called.
 

Richard Heathfield

CBFalconer said:
Huh? I do NOT understand the comment. It returns size_t.

By the time the function returns, it's a little late to worry about whether
it overwrote memory the program doesn't own.
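
One way the len suggestion might look, as a sketch only. The name deblank_n and the truncate-on-overflow policy are assumptions; the thread does not settle what should happen when the output does not fit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounded variant: writes at most len bytes to d,
   including the terminating '\0'. Output that does not fit is
   truncated. Returns the number of characters stored before the '\0'. */
static size_t deblank_n(char *d, size_t len, const char *s)
{
    char last = ' ';   /* elides leading blanks, as in the original */
    size_t n = 0;

    assert(d != NULL && s != NULL);
    if (len == 0) return 0;   /* no room even for the terminator */
    for (; *s != '\0' && n + 1 < len; last = *s++) {
        if ((' ' != *s) || (' ' != last)) d[n++] = *s;
    }
    d[n] = '\0';
    return n;
}
```

The check happens before each store, so the function never writes past d[len - 1], whatever the caller passes as s.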
 

Andrew Poelstra

Rod Pemberton said:

A variant for your perusal:

#include <stdio.h>
#define MAX 80

int main()
{
char s[MAX];
int i, j, k;
fgets(s, MAX, stdin);

for(i=0,j=0,k=0;s[j];

Undefined behaviour if fgets returned NULL.

Why?

I can see how the /contents/ of s[] could be undefined, but I'm
clearly missing something.
 

CBFalconer

Richard said:
CBFalconer said:

By the time the function returns, it's a little late to worry
about whether it overwrote memory the program doesn't own.

I see what you are getting at. In the original use the caller
created the destination using sizeof(source), which guaranteed
sufficient space, and allowed a string constant. How about:

/* Remove multiple blanks in place, return strlen(src) */
static size_t deblank(char *src)
{
char last, *p, *s;

for (p = s = src, last = ' '; (*p = *s); last = *s++)
if ((' ' != *s) || (' ' != last)) p++;
return p - src;
}

I still think it is nicely obfuscated.
 

CBFalconer

Andrew said:
Rod Pemberton said:

A variant for your perusal:

#include <stdio.h>
#define MAX 80

int main()
{
char s[MAX];
int i, j, k;
fgets(s, MAX, stdin);

for(i=0,j=0,k=0;s[j];

Undefined behaviour if fgets returned NULL.

Why? I can see how the /contents/ of s[] could be undefined, but
I'm clearly missing something.

Because if fgets fails it writes nothing into the uninitialized
s[], which therefore may not have a terminating '\0', and not be a
valid string. From N869:

[#3] The fgets function returns s if successful. If end-of-
file is encountered and no characters have been read into
the array, the contents of the array remain unchanged and a
null pointer is returned. If a read error occurs during the
operation, the array contents are indeterminate and a null
pointer is returned.

Note that initializing s to an empty string will not suffice,
because of the read error provision.
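
In practice this means testing the return value of fgets before touching s. A minimal sketch, with read_line being a made-up wrapper name rather than anything posted in the thread:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper: returns 1 if a line was read into buf.
   On end-of-file or read error it returns 0 and makes buf an
   empty string, so the caller never sees indeterminate contents. */
static int read_line(char *buf, int size, FILE *fp)
{
    assert(buf != NULL && size > 0 && fp != NULL);
    if (fgets(buf, size, fp) == NULL) {
        buf[0] = '\0';
        return 0;
    }
    return 1;
}
```

A caller would then write something like if (read_line(s, MAX, stdin)) { ... } and only process s inside that branch.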
 
