problem with regex

dimmaim

I want to find specific URLs in a txt file, but I have some issues. First, when I take just two lines from the file with copy and paste and assign them to a variable like this, it works, but only with triple quotes:

test='''_*_n.jpg","timelineCoverPhoto":"{\"focus\":{\"x\":0.5,\"y\":0.386925795053},\"photo\":{\"__type__\":{\"name\":\"Photo\"},\"image_lowres\":{\"uri\":\"https://fbcdn-photos-f-a.akamaihd.net/*-*-*/*_*_*_a.jpg\",\"width\":180,\"height\":179}}}","subscribeStatus":"IS_SUBSCRIBED","smallPictureUrl":"https://fbcdn-profile-a.akamaihd.net/*-*-*/s100x100/*_*_*_s.jpg","contactId":"*==","contactType":"USER","friendshipStatus":"ARE_FRIENDS","graphApiWriteId":"contact_*:*:*","hugePictureUrl":"https://fbcdn-profile-a.akamaihd.net/hprofile-ak-frc3/*_*_*_n.jpg","profileFbid":"*","isMobilePushable":"NO","lookupKey":null,"name":{"displayName":"* *","firstName":"*","lastName":"*"},"nameSearchTokens":["*","*"],"phones":[],"phoneticName":{"displayName":null,"firstName":null,"lastName":null},"isMemorialized":false,"communicationRank":1.1144714,"canViewerSendGift":false,"canMessage":true}
*=={"bigPictureUrl":"https://fbcdn-profile-a.akamaihd.net/hprofile-ak-ash3/*.*.*.*/s200x200/*_*_*_n.jpg","timelineCoverPhoto":"{\"focus\":{\"x\":0..5,\"y\":0.49137931034483},\"photo\":{\"__type__\":{\"name\":\"Photo\"},\"image_lowres\":{\"uri\":\"https://fbcdn-photos-h-a.akamaihd.net/*-*-*/*_*_*_a.jpg\",\"width\":180,\"height\":135}}}","subscribeStatus":"IS_SUBSCRIBED","smallPictureUrl":"https://fbcdn-profile-a.akamaihd.net/*-*-*/*.*.*.*/s100x100/*_*_*_a.jpg","contactId":"*==","contactType":"USER","friendshipStatus":"ARE_FRIENDS","graphApiWriteId":"contact_*:*:*","hugePictureUrl":"https://fbcdn-profile-a.akamaihd.net/hprofile-ak-ash3/c0.0.540.540/*_*_*_n.jpg","profileFbid":"*","isMobilePushable":"YES","lookupKey":null,"name":{"displayName":"* *","firstName":"*","lastName":"*"},"nameSearchTokens":["*","*"],"phones":[],"phoneticName":{"displayName":null,"firstName":null,"lastName":null},"isMemorialized":false,"communicationRank":1.2158813,"canViewerSendGift":false,"canMessage":true}'''

uri = re.findall(r'''uri\":\"https://fbcdn-(a-z|photos)?([^\'" >]+)''',test)
print uri

It works fine and I get my result: [('photos', '-f-a.akamaihd.net/*-*-*/*_*_*_a.jpg'), ('photos', '-h-a.akamaihd.net/*-*-*/*_*_*_a.jpg')]

But if I take those lines and save them into a txt file, as in the original (without the quotes), and do the following:

datafile = open('a.txt', 'r')
data_array = ''
for line in datafile:
    data_array = data_array + line

uri = re.findall(r'''uri\":\"https://fbcdn-(a-z|photos)?([^\'" >]+)''',data_array)

After printing uri, it gives an empty list. What should I do to make it work for the lines of a txt file?
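
One likely cause, sketched below under some assumptions: inside a non-raw '''...''' literal Python turns every \" into a plain ", so the pasted text no longer contains the backslashes that are still present in the file. The pattern only matches uri":" and never sees uri\":\" as stored on disk. The names file_text, pasted_text and tolerant_pattern are made up for this illustration, and the sample line is a shortened piece of the data above. The tolerant pattern is one possible adjustment, not the only one.

```python
import re

# Text as stored in the file: the backslashes before the quotes are real characters.
file_text = r'\"image_lowres\":{\"uri\":\"https://fbcdn-photos-f-a.akamaihd.net/*-*-*/*_*_*_a.jpg\",\"width\":180}'

# The same text pasted into a non-raw triple-quoted literal: each \" becomes ".
pasted_text = '''\"image_lowres\":{\"uri\":\"https://fbcdn-photos-f-a.akamaihd.net/*-*-*/*_*_*_a.jpg\",\"width\":180}'''

# Pattern from the question. In a regex, \" simply means a literal quote,
# so it expects uri":" with no backslash in between.
original_pattern = r'''uri\":\"https://fbcdn-(a-z|photos)?([^\'" >]+)'''

# Assumed fix: allow an optional literal backslash (\\?) before each quote,
# and stop the URL at a backslash as well so the trailing \ is not captured.
# Note that (a-z|photos) matches the literal text "a-z" or "photos";
# it is kept unchanged here so the results stay comparable.
tolerant_pattern = r'''uri\\?":\\?"https://fbcdn-(a-z|photos)?([^\\'" >]+)'''

pasted_result = re.findall(original_pattern, pasted_text)
file_result = re.findall(original_pattern, file_text)
tolerant_result = re.findall(tolerant_pattern, file_text)

print(pasted_result)    # matches, as in the triple-quote test
print(file_result)      # empty list, as with the txt file
print(tolerant_result)  # matches the file text too
```

If the file really is a dump of JSON records, decoding each record with the json module and reading the uri key may be more robust than a regex, but that depends on the exact file layout, which is not fully visible here.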
 
