Convert WP to static HTML – part 2
This is a follow-up to this previous post.
I’ve been converting some more blogs to static HTML files, and this time things went differently enough that I wrote up a new how-to. Here are the steps I’ve been using to convert blogs that use the default Kubrick theme.
- Update the permalink structure for the site so that it uses the year, month, day, postname structure.
UPDATE `database`.`prefix_options` SET `option_value` = '/%year%/%monthnum%/%day%/%postname%/' WHERE `prefix_options`.`option_name` = 'permalink_structure' LIMIT 1;
- Make sure the blog does not block search engines. If the blog is set to block them, wget can only download the index.html file, because a recursive wget honors the same robots rules that search engines do. This took me a while to figure out. So if a recursive wget gets only index.html, check your robots.txt or similar settings. Change the setting either in the admin section (under Settings->Privacy) or via SQL.
UPDATE `database`.`prefix_options` SET `option_value` = '1' WHERE `prefix_options`.`option_name` = 'blog_public' LIMIT 1;
- Add the .htaccess file if not already there, where
/path/to/wordpress/blog/
starts at the URL root, not the absolute file path. So http://sitename.com/path/to/wordpress/blog/ would have the .htaccess file below in the ‘blog’ directory.
RewriteEngine On
RewriteBase /path/to/wordpress/blog/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /path/to/wordpress/blog/index.php [L]
- Get rid of the meta links through the sidebar widget in the admin, or delete the appropriate lines from the theme files (for the default Kubrick theme, edit comments.php, sidebar.php, single.php, and footer.php), or see the last step. Delete the code that outputs the search form, comments, trackbacks, RSS links, and anything in the footer you want gone.
- When all is good, run wget to grab the files.
wget --mirror -P blog-static -nH -np -p -k -E --cut-dirs=5 http://sitename.com/blog/
- Rename the blog directory.
mv blog blog-old
- Rename the static directory to be live.
mv blog-static blog
- Copy the images directory from the old theme to the appropriate static directory.
cp -r blog-old/wordpress/wp-content/themes/default/images/ blog/wordpress/wp-content/themes/default/
- An alternative way to get rid of unwanted links and the like: use the find command to locate all HTML files, then use perl to delete the matching lines. Don’t forget to escape forward slashes in the search pattern. Unfortunately, this method has to be repeated for every line of code you want to delete, so it’s much better to delete the lines from the theme files.
find . -name \*.html | xargs perl -ni -e 'print unless /<h3>Leave a Reply<\/h3>/'
Also, if you want to search and replace rather than remove, this handy find and perl one-liner will find and replace text in all HTML files.
find . -name '*.html' | xargs perl -p -i -e 's/search text here/replace text there/g'
The above searches every HTML file for the phrase “search text here” and replaces it with “replace text there”. You can substitute whatever you want in those two places. A ‘/’ (forward slash) character needs to be escaped with a ‘\’ (backslash) character. Perl uses standard regular expression syntax, so look that up if you need help formulating a search and replace pattern.
7 Comments to Convert WP to static HTML – part 2
Nice write-up, Ammon, thanks! Does this process also take care of URL redirects to the new .html?
Also, why do you have to use the default theme? Can you do this so it uses the theme the blog had set prior to converting to HTML? I noticed when you did this to my Hist 120 class blog a few years ago, it reverted to the default theme instead of my custom class feed, which had some additional content that wasn’t in the database. Not complaining about that, just curious!
Yeah, the process takes care of the URL redirects to the new .html pages too. There’s usually no clean up of the files after the wget process.
Most of the blogs I’ve done so far used the default theme already, so I just stuck with that. It should work out fine with custom themes as well, since it just grabs what’s there (style sheets, images and all).
Sorry about the missing content. Your blog unfortunately suffered from being one of the guinea pigs. I think I finally have this process down pretty good now.
August 3, 2009
Added another find/perl one liner to find and replace text in an html file.
Also check out the script that should convert a site for you: http://historicalwebber.mossiso.com/code/make-wordpress-static
August 17, 2010
Hello,
First, I want to thank you for writing the script above for everyone. It is so helpful, especially for beginners like me.
Sadly, I don’t know how to run the script after uploading it to where wp-config.php is. Do I run it with MySQL, or using something else? Thanks!
Hi tommy,
Once you have the script on your server, you can run it from the command line. You’ll need to use a terminal (like PuTTY, if you’re using Windows). Make sure the permissions on the script are set so that your user has execute permission (for example 766, which shows as rwxrw-rw- if you do an 'ls -lh' to view the permissions). Then execute the script like so: './wpstatic'.
Hope that helps.
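The permission step in the reply above could look roughly like this. It is a sketch only: `run_script` is a hypothetical helper, and it uses `chmod u+x`, which adds execute permission for the file's owner without also opening write access to everyone the way 766 does.

```shell
# Sketch: make a script executable for its owner, then run it.
# run_script is a hypothetical helper name used for illustration.
run_script() {
    script=$1
    chmod u+x "$script" || return 1   # add execute permission for the owner only
    [ -x "$script" ] || return 1      # confirm the current user can execute it
    "$script"                         # run it, e.g. ./wpstatic
}
```

From the directory containing the script, this would be called as `run_script ./wpstatic`.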
July 21, 2009