Kieran Simkin
Hi All,
I'm writing a quick function to remove HTML tags from a string (array of
chars) and I'd like your comments on my code: anything you'd do differently,
any mistakes, etc. I'm still kinda new to C, so I'm not 100% confident
using pointers yet.
Anyway, the algorithm works like this: the loop steps over the string
character by character with two pointers, 's' (read) and 'c' (write). A
toggle variable 't' indicates whether 's' is currently inside an HTML tag.
If it is, 's' is incremented but 'c' isn't. If it isn't, the character
pointed to by 's' is copied to where 'c' points and both are incremented.
So basically the string is rebuilt in place, skipping over HTML tags and
everything between their angle brackets.
Here's the code, all comments welcome:
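void striphtml (char *s) {
    char *c = s;    /* write position; 's' is the read position */
    char t = 0;     /* 1 while 's' is inside a tag, 0 otherwise */

    while (*s != '\0') {
        if (*s == '<') {
            t = 1;              /* tag opens: stop copying */
        } else if (*s == '>') {
            t = 0;              /* tag closes: resume copying */
        } else if (!t) {
            *(c++) = *s;        /* ordinary character: keep it */
        }
        s++;
    }
    *c = '\0';      /* terminate the shortened string */
}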
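
For anyone who wants to try it, here's a minimal sketch of the same two-pointer scan with more descriptive names. The name striphtml_sketch, the variables src, dst and in_tag, and the returned length are my own additions for illustration and aren't part of the function above. It makes the same simplifying assumption: every '<' opens a tag and every '>' closes one, so a stray '>' in ordinary text gets dropped and an unclosed '<' swallows the rest of the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>  /* strcmp, only needed for the usage shown below */

/* Hypothetical variant of striphtml: the same in-place scan, but it
   also returns the length of the stripped string. */
size_t striphtml_sketch(char *s)
{
    char *src = s;   /* read position */
    char *dst = s;   /* write position, never ahead of src */
    int in_tag = 0;  /* nonzero while src is between '<' and '>' */

    assert(s != NULL);

    for (; *src != '\0'; src++) {
        if (*src == '<') {
            in_tag = 1;
        } else if (*src == '>') {
            in_tag = 0;
        } else if (!in_tag) {
            *dst++ = *src;   /* keep characters outside tags */
        }
    }
    *dst = '\0';
    return (size_t)(dst - s);
}
```

The buffer has to be writable, so pass an array rather than a string literal: with char buf[] = "<p>Hello, <b>world</b>!</p>"; a call to striphtml_sketch(buf) leaves buf holding "Hello, world!", which strcmp can confirm.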
Cheers.
~Kieran Simkin
Digital Crocus
http://digital-crocus.com/