function to url decode a string

  • Thread starter Ramprasad A Padmanabhan

Ramprasad A Padmanabhan

Hello,

Can anyone tell me where I can find a function that decodes a
URL-encoded string

like
ram%40domain.tld ==> (e-mail address removed)

Thanks
Ram
ram 'at' netcore.co.in
 
Walt Fles

Ramprasad A Padmanabhan said:
> Hello,
>
> Can anyone tell me where I can find a function that decodes a
> URL-encoded string
>
> like
> ram%40domain.tld ==> (e-mail address removed)

Write one; isn't that what a programmer is for?
Strip off the "%", then take the next two characters and convert them from
ASCII hex digits to the single character they encode.
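
The steps described above can be sketched roughly as follows. This is only an illustration: the helper name decode_escape and its "-1 on failure" convention are assumptions, not something from this thread. It checks that two hex digits really follow the '%' before converting them with strtol().

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper (name and return convention are assumptions):
 * decode one "%XX" escape starting at p.
 * Returns the byte value (0-255), or -1 if p does not point at
 * '%' followed by two hex digits. */
int decode_escape(const char *p)
{
    char hex[3];

    /* Short-circuit evaluation stops at the first failing test,
     * so we never read past the terminating '\0'. */
    if (p[0] != '%' ||
        !isxdigit((unsigned char)p[1]) ||
        !isxdigit((unsigned char)p[2]))
        return -1;

    hex[0] = p[1];      /* strip off the '%', keep the next two characters */
    hex[1] = p[2];
    hex[2] = '\0';
    return (int)strtol(hex, NULL, 16);   /* hex text to character code */
}
```

For example, decode_escape("%40") yields 0x40, which is '@' in ASCII, while decode_escape("%4") yields -1 because the escape is incomplete.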
 
Default User

Ramprasad said:
> Hello,
>
> Can anyone tell me where I can find a function that decodes a
> URL-encoded string

This is a very common thing. Do a Google search for CGI utilities
written in C.



Brian Rodenborn
 
Mike Wahler

Ramprasad A Padmanabhan said:
> Hello,
>
> Can anyone tell me where I can find a function that decodes a
> URL-encoded string
>
> like
> ram%40domain.tld ==> (e-mail address removed)


/* (no error checking included) */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copy src to dest, replacing each %XX escape with the byte it encodes.
   dest must have room for strlen(src) + 1 bytes. */
void cvt(char *dest, const char *src)
{
    const char *p = src;
    char code[3] = {0};
    unsigned long ascii = 0;
    char *end = NULL;

    while(*p)
    {
        if(*p == '%')
        {
            memcpy(code, ++p, 2);            /* the two characters after '%' */
            ascii = strtoul(code, &end, 16); /* parse them as hex */
            *dest++ = (char)ascii;
            p += 2;
        }
        else
            *dest++ = *p++;
    }
    *dest = '\0';                            /* terminate the result */
}

int main(void)
{
    char in[] = "ram%40domain.tld";
    char out[sizeof in] = {0};
    cvt(out, in);
    printf("in == %s\nout == %s\n", in, out);
    return 0;
}

-Mike
 
Ramprasad A Padmanabhan

Michael said:
> This function should be called 'bugtraq'.
>
> Mike

Thanks, but I am not worried about any security risk in the function.
The function's caller will do all the checking.

I am not a programming expert, just a beginner with C, but IMHO
it is not worth trying to trap every overflow in every function and
end up making the code very heavy.

If this function is called *only* from my own script and I know
exactly what I am doing, then I think it will still do.

Thanks
Ram
 
Mike Wahler

Michael B Allen said:
> This function should be called 'bugtraq'.

It's a non-production *example*. I included a caveat
about no error checking. Call it what you will.

-Mike
 
Morris Dovey

Ramprasad said:
> Can anyone tell me where I can find a function that decodes
> a URL-encoded string like ram%40domain.tld ==> (e-mail address removed)

Ram...

The code at http://www.iedu.com/mrd/c/kvp.c contains a function
to do the decoding you want. I'm sure that you can find much more
(and possibly much better code) with a Google search.
 
James Antill

> Thanks, but I am not worried about any security risk in the function.
> The function's caller will do all the checking.

This isn't likely, IMO. Given an interface like the above, it's much harder
to check that the arguments are good to use.
Personally I'd recommend looking at a real string API
http://www.and.org/vstr/comparison.html ... the first on the list has
URI encode/decode functions.

If you want to pretend you don't need one then the libclc function
discussed a couple of days ago, in this very group, would be a better
starting point.

> I am not a programming expert, just a beginner with C, but IMHO
> it is not worth trying to trap every overflow in every function and
> end up making the code very heavy.

This is a misconception due to your inexperience. Stopping errors
_always_ needs to be done and, if done properly, doesn't make the code any
heavier.
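
To give a feel for how little the checking costs, here is a rough sketch of the earlier cvt() example with a destination size added. The name cvt_checked, the destsize parameter and the -1 return value are all assumptions made for illustration; this is not the libclc or Vstr interface.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bounded variant of cvt() (names are assumptions).
 * Returns the decoded length, or -1 if an escape is malformed or
 * dest (destsize bytes) is too small for the result plus '\0'. */
int cvt_checked(char *dest, size_t destsize, const char *src)
{
    size_t n = 0;

    if (destsize == 0)
        return -1;                        /* no room even for '\0' */

    while (*src) {
        if (n + 1 >= destsize)            /* keep one byte for '\0' */
            return -1;
        if (*src == '%') {
            char code[3] = {0};
            if (!isxdigit((unsigned char)src[1]) ||
                !isxdigit((unsigned char)src[2]))
                return -1;                /* truncated or non-hex escape */
            code[0] = src[1];
            code[1] = src[2];
            dest[n++] = (char)strtoul(code, NULL, 16);
            src += 3;                     /* skip '%' and both digits */
        } else {
            dest[n++] = *src++;
        }
    }
    dest[n] = '\0';
    return (int)n;
}
```

Compared with the unchecked version, the extra work is one size comparison per output byte and two isxdigit() calls per escape. A caller would pass sizeof out as destsize and treat -1 as "bad input or buffer too small".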
 
Michael B Allen

> If this function is called *only* from my own script and I know
> exactly what I am doing, then I think it will still do.

A URL is almost invariably supplied by a user or by another
program. In both cases, unless *you* are always the one typing in the
URL, your code must consider erroneous input. Poor URL processing is a
favorite target of crackers.

The code below should be correct and safe, although I have only tested
it with the one input.

Mike

--8<--

#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <ctype.h>
#include <stdio.h>

/* Decode %XX escapes from [src, slim) into [dst, dlim).
 * Returns the decoded length, or -1 with errno set to EILSEQ.
 * NOTE: isdigit() accepts only 0-9, so escapes containing the
 * hex digits A-F (e.g. %2F) are rejected. */
int
url_decode(const char *src, const char *slim, char *dst, char *dlim)
{
    int state = 0, code = 0;
    char *start = dst;

    if (dst >= dlim) {
        return 0;
    }
    dlim--; /* ensure spot for '\0' */

    while (src < slim && dst < dlim) {
        switch (state) {
            case 0: /* ordinary character, or the start of an escape */
                if (*src == '%') {
                    state = 1;
                } else {
                    *dst++ = *src;
                }
                break;
            case 1: /* first digit after '%' */
                code = *src - '0';
                /* fall through */
            case 2: /* validate the digit; on the second one, emit the byte */
                if (isdigit((unsigned char)*src) == 0) {
                    errno = EILSEQ;
                    return -1;
                }
                if (state == 2) {
                    *dst++ = (code * 16) + *src - '0';
                    state = 0;
                } else {
                    state = 2;
                }
                break;
        }
        src++;
    }
    *dst = '\0'; /* I'll be back */

    return dst - start;
}

int main(void)
{
    const char *src = "ram%40domain.tld/a/b/c%40d/%24%40%24abc";
    char dst[1024];

    if (url_decode(src, src + strlen(src), dst, dst + 1024) == -1) {
        perror("url_decode");
        return EXIT_FAILURE;
    }
    puts(src);
    puts(dst);

    return EXIT_SUCCESS;
}
 
Ramprasad A Padmanabhan

James said:
> This isn't likely, IMO. Given an interface like the above, it's much harder
> to check that the arguments are good to use.
> Personally I'd recommend looking at a real string API
> http://www.and.org/vstr/comparison.html ... the first on the list has
> URI encode/decode functions.
>
> If you want to pretend you don't need one then the libclc function
> discussed a couple of days ago, in this very group, would be a better
> starting point.
>
> This is a misconception due to your inexperience. Stopping errors
> _always_ needs to be done and, if done properly, doesn't make the code any
> heavier.

Why not? I just want to get my fundamentals clear, not argue that I
am right.

If a string is used in function A(), and within A() in B(), and within B()
in C(), then if I check the string (for some error condition) I will do it
only in A(), because B() and C() are not exposed directly at all.

If I include the check in B() and in C() as well, then there are more ifs
and elses in my functions, so how can that be better code?

Ram
 
Michael B Allen

> A URL is almost invariably supplied by a user or by another
> program. In both cases, unless *you* are always the one typing in the
> URL, your code must consider erroneous input. Poor URL processing is a
> favorite target of crackers.
>
> The code below should be correct and safe, although I have only tested it
> with the one input.

And thus it did not convert all hex digits correctly. See the state
machine example on the homepage for an updated version.

http://www.ioplex.com/~miallen/

Mike
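
For readers who cannot reach that page, one way the hex-digit problem might be addressed is sketched below. This is not the updated version from the homepage. The names hex_value and url_decode_hex are assumptions, and rejecting input that ends in the middle of an escape is an added assumption too. The interface and the EILSEQ convention follow the earlier url_decode() post.

```c
#include <errno.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Sketch of the same state machine, accepting 0-9, a-f and A-F.
 * Returns the decoded length, or -1 with errno set to EILSEQ. */
int url_decode_hex(const char *src, const char *slim, char *dst, char *dlim)
{
    int state = 0, code = 0, v;
    char *start = dst;

    if (dst >= dlim)
        return 0;
    dlim--;                               /* ensure spot for '\0' */

    while (src < slim && dst < dlim) {
        if (state == 0) {                 /* ordinary character or '%' */
            if (*src == '%')
                state = 1;
            else
                *dst++ = *src;
        } else {                          /* first or second hex digit */
            v = hex_value((unsigned char)*src);
            if (v < 0) {
                errno = EILSEQ;
                return -1;
            }
            if (state == 1) {
                code = v;                 /* remember the high nibble */
                state = 2;
            } else {
                *dst++ = (char)(code * 16 + v);
                state = 0;
            }
        }
        src++;
    }
    if (state != 0) {                     /* input ended inside an escape */
        errno = EILSEQ;
        return -1;
    }
    *dst = '\0';
    return (int)(dst - start);
}
```

With this version an input such as "a%2Fb" decodes to "a/b", whereas a digit test based on isdigit() would reject the 'F'.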
 
