Using Regular Expressions to change .htm to .php in files

sebzzz

Hi,

I have a bunch of files that have changed from standard htm files to
php files but all the links inside the site are now broken because
they point to the .htm files while they are now .php files.

Does anyone have an idea how to write a simple script that changes
each .htm in a given file to .php?

Thanks a lot in advance
 
Tobiah

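#!/bin/bash

# Rewrite .htm references in every .php file in the current directory.
for each in *.php; do
    # Write the edited copy to a temp file, then move it over the original.
    sed "s/.htm/.php/g" < "$each" > /tmp/$$
    mv /tmp/$$ "$each"
done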
 
Ryan Ginstrom

Mark wrote:
This line should be:

sed "s/\.htm$/.php/g" < $each > /tmp/$$
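
For anyone who would rather stay in Python, the same substitution could be sketched with re.sub. The pattern escapes the dot as in the sed line above, but leaves out the `$` anchor, since `$` only matches at the end of a line and most links sit in the middle of one. Instead it uses a negative lookahead so that ".html" is left alone; that choice, and the names `HTM_EXT` and `htm_to_php`, are my own assumptions rather than anything from the thread.

```python
import re

# A literal ".htm" that is not followed by another letter or digit,
# so ".html" is not touched.
HTM_EXT = re.compile(r"\.htm(?![A-Za-z0-9])")


def htm_to_php(text):
    """Return text with every standalone '.htm' replaced by '.php'."""
    return HTM_EXT.sub(".php", text)


sample = '<a href="about.htm">About</a> <a href="news.htm#top">News</a> guide.html'
print(htm_to_php(sample))
# -> <a href="about.php">About</a> <a href="news.php#top">News</a> guide.html
```

Note that this still rewrites every ".htm" in the text, including links to other sites, which is the weakness the link-based approach below tries to avoid.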

I think a more robust way to go about this would be:

(1) Use os.walk to walk through the directory
http://docs.python.org/lib/os-file-dir.html

(2) Use Beautiful Soup to extract the internal links from each file
http://crummy.com/software/BeautifulSoup/documentation.html

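from bs4 import BeautifulSoup  # Beautiful Soup 4 (the old "BeautifulSoup" module was version 3)

# doc is the HTML text of one file
soup = BeautifulSoup(doc, "html.parser")
links = soup.find_all("a", href=True)  # only <a> tags that have an href
internal_links = [link["href"]
                  for link in links
                  if not link["href"].startswith("http")]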

(3) Do straight string replacements on those links (no regex needed)

(4) Save each html file to *.html.bak before changing
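
The steps above could be put together roughly as follows. This is only a sketch under a few assumptions: it uses the standard library's html.parser in place of Beautiful Soup so that it runs without extra installs, it assumes the files are UTF-8 and mostly plain HTML, and the names `LinkCollector`, `internal_htm_links`, `fix_links` and `convert_tree` are made up for illustration.

```python
import os
import shutil
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def _page_part(href):
    """Return the href without any '#fragment' or '?query' suffix."""
    return href.split("#")[0].split("?")[0]


def internal_htm_links(html):
    """Return the internal hrefs in html that point at a .htm page."""
    parser = LinkCollector()
    parser.feed(html)
    return [href for href in parser.hrefs
            if not href.startswith("http") and _page_part(href).endswith(".htm")]


def fix_links(html):
    """Replace each quoted internal .htm link with its .php equivalent."""
    for href in set(internal_htm_links(html)):
        page = _page_part(href)
        new_href = page[:-len(".htm")] + ".php" + href[len(page):]
        # Plain string replacement on the quoted attribute value, no regex.
        for quote in ('"', "'"):
            html = html.replace(quote + href + quote, quote + new_href + quote)
    return html


def convert_tree(root):
    """Fix links in every .php file under root, keeping a .bak copy."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".php"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", newline="") as handle:
                original = handle.read()
            updated = fix_links(original)
            if updated != original:
                shutil.copy2(path, path + ".bak")  # back up before changing
                with open(path, "w", encoding="utf-8", newline="") as handle:
                    handle.write(updated)
                changed.append(path)
    return changed
```

Calling `convert_tree(".")` from the site's root directory would return the list of files it rewrote. Links whose href contains escaped characters such as `&amp;` will not be matched by the plain string replacement, so it is worth checking the .bak copies against the results before deleting them.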


Regards,
Ryan Ginstrom
 
