Bookmark URL Parsing

Timothy Wu

Hi,

I'm trying to parse Firefox bookmark files manually using regular
expressions. I tried to match key-value pairs in tags with the following:

matches = re.findall(r'(\S+)="(.+)"', text)

However, I find that if the URL I'm matching contains something
non-standard, I may run into a problem. For example, one of my links
contains JavaScript code with the character sequence %22 (as shown
when opening the file in vi). I've figured out that %22 stands for the
quotation mark '"'. That interferes with my match.

How exactly does %22 map to the quotation mark? I often see this
kind of representation in URLs, but what exactly is it, and where would
I find info on it? And most importantly, how do I make my match work?
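
To make the question concrete, here is a minimal sketch of the behavior, assuming Python 3 and a made-up bookmark line (the sample `line` and the names `greedy`, `strict`, and `decoded` are illustrative, not taken from a real bookmarks file). It suggests that %22 is URL percent-encoding ('%' followed by the hex code of the byte, 0x22 for '"'), and that the greedy `.+` in the pattern, rather than %22 itself, may be what over-matches:

```python
import re
from urllib.parse import unquote

# Hypothetical bookmark line; not taken from a real bookmarks file.
line = '<A HREF="javascript:alert(%22hi%22)" ADD_DATE="1100000000">Demo</A>'

# Pattern from the post: the greedy .+ runs to the LAST quote on the line,
# so it swallows the following attribute as well.
greedy = re.findall(r'(\S+)="(.+)"', line)

# Possible alternative: the value may not contain a literal quote,
# so each match stops at the first closing quote.
strict = re.findall(r'(\S+)="([^"]*)"', line)

# %22 is percent-encoding for the byte 0x22, i.e. the '"' character.
# unquote() turns such escapes back into the characters they stand for.
decoded = [(key, unquote(value)) for key, value in strict]

print(greedy)
print(strict)
print(decoded)
```

With this sample, the literal text %22 never ends a match on its own, since it only becomes a quotation mark after decoding. Whether that also holds for the real file depends on whether it contains raw '"' characters inside attribute values.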

Timothy
 
