base64.urlsafe_b64encode and the equal character

Clodoaldo

I'm using an MD5 hash encoded with base64.urlsafe_b64encode as a
parameter of a URL used to confirm a registration on a site. It has
been working great.

The URL looks like this:

http://example.com/ce?i=878&h=kTfWSUaby5sBu9bIfoR87Q==

Now I need to match that URL in a certain text, and I realized that
urlsafe_b64encode uses the "=" character, so I can't just use \w{24} to
match the parameter.

What I need to know is: where can an equal sign appear in a
urlsafe_b64encoded string?

a) only at the end;
b) both at the end and at the beginning;
c) anywhere in the string.

A sure answer will make my regexp safer.

On another point, does the "=" char make it URL-unsafe? I guess not,
because I believe it would only be unsafe if the equal sign appeared as
in "&var=", and since there is no "&" in the string, there is no
problem, right? Or wrong?

Regards, Clodoaldo Pinto Neto
 
Gabriel Genellina

Only at the end. The encoded string has 4*n chars when the input string
has 3*n chars; when the input length is 3*n+1 or 3*n+2, the output has
4*(n+1) chars, right-padded with 2 or 1 "=" chars respectively.
If your input has 3*n chars, the output won't contain any "=".
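
The padding rule above can be checked empirically. This is only a small sketch, assuming Python 3 (where urlsafe_b64encode takes bytes); padding_report is a made-up helper name, not part of the base64 module.

```python
import base64


def padding_report(max_len=6):
    """Return (input_length, encoded_text, pad_count) for inputs of 0..max_len bytes."""
    rows = []
    for length in range(max_len + 1):
        encoded = base64.urlsafe_b64encode(b"x" * length).decode("ascii")
        # Count the "=" characters in the trailing padding run.
        pad = len(encoded) - len(encoded.rstrip("="))
        # Once the trailing run is stripped, no "=" should remain anywhere.
        assert "=" not in encoded.rstrip("=")
        rows.append((length, encoded, pad))
    return rows


for length, encoded, pad in padding_report():
    print(length, repr(encoded), pad)
```

The pad count should cycle 0, 2, 1 as the input length grows, and every output length should be a multiple of 4.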

> On another point, does the "=" char make it URL-unsafe? I guess not,
> because I believe it would only be unsafe if the equal sign appeared as
> in "&var=", and since there is no "&" in the string, there is no
> problem, right? Or wrong?

I guess not, but you should check the relevant RFCs. Or at least check
that your server can always parse the request.
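
One quick way to run that check, at least against Python's own query-string parser, is sketched below. It assumes Python 3's urllib.parse; other servers and frameworks may parse differently, so this shows only what the standard library does. Percent-encoding the value with quote() is an optional extra, not something the thread requires.

```python
from urllib.parse import parse_qs, quote, urlsplit

url = "http://example.com/ce?i=878&h=kTfWSUaby5sBu9bIfoR87Q=="

# Split the query on "&", then each pair on the first "=" only,
# so the trailing "==" stays part of the value.
params = parse_qs(urlsplit(url).query)
print(params["h"][0])

# If you would rather not rely on that, percent-encode the value:
# "=" becomes "%3D", and parse_qs decodes it back on the way in.
safe_value = quote("kTfWSUaby5sBu9bIfoR87Q==", safe="")
print(safe_value)
```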
 
Clodoaldo

On Fri, 28 Mar 2008 10:54:49 -0300, Clodoaldo <[email protected]>
wrote:

Thanks. But I'm not sure I get it. What is n?

An MD5 digest will always be 16 bytes long. So if I understand it
correctly (not sure), the output will always be 22 chars plus two
trailing equal signs. Right?

Regards, Clodoaldo Pinto Neto
 
Gabriel Genellina

> Thanks. But I'm not sure I get it. What is n?

(Any nonnegative integer...)
I mean: for base64 encoding, the length of the output depends solely on
the length of the input. If the input string length is a multiple of 3,
the output length is a multiple of 4, with no "=". If the input length is
one more than a multiple of 3, the output has two "=" chars at the end. If
the input length is two more than a multiple of 3, the output has only one
"=" at the end. In all cases, the output length is a multiple of 4.

[base64 uses 64=2**6 characters, so it encodes 6 bits per character; to
encode 3 bytes=3*8=24 bits one requires 24/6=4 characters]
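
The bit arithmetic in the bracketed note can be spelled out by hand. This is only an illustrative sketch, assuming Python 3; encode_three_bytes and encoded_length are made-up helper names, and real code should simply call base64.urlsafe_b64encode.

```python
import base64

# The 64 characters of the URL-safe alphabet: index 0..63.
URLSAFE_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_"
)


def encode_three_bytes(chunk):
    """Encode exactly 3 bytes (24 bits) into 4 characters of 6 bits each."""
    if len(chunk) != 3:
        raise ValueError("expected exactly 3 bytes")
    bits = int.from_bytes(chunk, "big")  # one 24-bit integer
    # Slice off four 6-bit groups, most significant group first.
    indices = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(URLSAFE_ALPHABET[i] for i in indices)


def encoded_length(n):
    """Padded output length for n input bytes: 4 chars per (partial) 3-byte group."""
    return 4 * ((n + 2) // 3)


print(encode_three_bytes(b"Man"))
print(encoded_length(16))
```

For a 16-byte input, encoded_length gives 24, which is where the "22 chars plus two padding chars" figure comes from.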

> An MD5 digest will always be 16 bytes long. So if I understand it
> correctly (not sure), the output will always be 22 chars plus two
> trailing equal signs. Right?

Exactly.
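
Given that, a regexp for the "h" parameter could look like the sketch below. It assumes Python 3 and the parameter name "h" from the example URL; TOKEN_RE and make_token are made-up names. Note that \w alone is not enough for the 22 data characters, because the URL-safe alphabet also contains "-", which \w does not match.

```python
import base64
import hashlib
import re

# 22 characters from the URL-safe alphabet, then exactly two "=" of padding.
TOKEN_RE = re.compile(r"h=([A-Za-z0-9_-]{22}==)")


def make_token(data):
    """URL-safe base64 of the 16-byte MD5 digest of data (always 24 chars)."""
    digest = hashlib.md5(data).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


text = "Please confirm at http://example.com/ce?i=878&h=kTfWSUaby5sBu9bIfoR87Q== today."
match = TOKEN_RE.search(text)
if match:
    print(match.group(1))
```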
 
Clodoaldo

On Fri, 28 Mar 2008 13:22:06 -0300, Clodoaldo <[email protected]>
wrote:

Thank you. That was great support!
 
