How to produce a static mirror of a Drupal website?

Note: Only use this on your own sites.

Prepare the Drupal website

Create a custom block and/or post a node to the front page noting that the site has been archived from Drupal to static HTML. Be sure to include the date of the archiving. Consider including a link to future versions of the site (e.g. if you are archiving a 2008 event, link to the URL of the next event).

Disable interactive elements that will be nonfunctional in the static HTML version:

- Use the Disable All Forms module to disable all forms.
- Remove links to the search module and/or any search boxes in the header.
- Remove comment controls that let the user select the comment display format.
- Disable AJAX requests such as Views pagers.
- Update all nodes by setting their comments to read-only. This eliminates the "login or register to post comments" link that would otherwise accompany each of your posts. You can run the SQL through phpMyAdmin, or from the command line with Drush:

    drush sql:query "UPDATE node_revision SET comment = '1'"

It can also be a good idea to disable any third-party dynamically generated blocks. Once the site is archived, it would be difficult to remove these blocks if the third-party services are no longer available.

Create a static clone

Wget (UNIX, Linux, OSX)

Wget is generally available on almost any *nix machine and can produce the mirror from the command line. However, wget seems to have problems converting relative stylesheet URLs properly on many Drupal site pages. Modify your theme template to produce hardcoded absolute links to the stylesheets and try the following command:

    wget -q --mirror -p --adjust-extension -e robots=off --base=./ -k -P .

Wget respects robots.txt files, so it might not download some of the files in /sites/ or elsewhere. To disable this, include the option -e robots=off in your command line.

Wget keeps query strings in file names, such as "?itok=qRoiFlnG" on image files. Recursively remove all query strings with:

    find . -name "*.*\?*" | while read filename; do mv "$filename" "${filename%%\?*}"; done

HTTrack (UNIX, Windows and Mac/Homebrew)

The Windows GUI client version will produce the mirror with almost no configuration on your part. One potential command to use is:

    httrack -K -w -O .

Note that the -K option creates absolute links; this is only sometimes useful if you are hosting a public mirror on the same domain. Otherwise omit -K to produce relative links. The -c1 option makes only 1 request at a time, so mirroring becomes rather slow. The default is -c10, so you might consider something closer to that value when archiving your own site.

With HTTrack properly configured, you don't have to hack common.inc to get all of your stylesheets to work correctly. However, with the default robots.txt settings in Drupal 5 and the "good citizen" default HTTrack settings, you won't get any module or theme CSS files or JavaScript files. If you're working from a local installation of Drupal and want to grab ALL of your files in a way that you can just copy them up to a server, try the following command:

    httrack -W -O "~/static_cache" -%v --robots=0

KarenS has created a very helpful description of how to "statify" a Drupal site with HTTrack, where she suggests the following command (on a Linux console):

    httrack LOCALSITE -O DESTINATION -N "%h%p/%n/index%.%t" -WqQ%v --robots=0 --footer ''

(LOCALSITE is the URL of the site that's being copied, and DESTINATION is the path to the directory where the static pages should go.)

She further suggests running a regex on all files to fix link issues with index.html:

    find . -name "*.html" -type f -print0 | xargs -0 perl -i -pe "s/\/index.html/\//g"

This leaves Drupal's no-trailing-slash paradigm intact and avoids "duplicate content" issues while preserving absolute paths. Note that this only works with a web server configured to add the necessary trailing slashes again and resolve to the actual index.html file. Use, for example, DDEV or Lando to quickly spin up a LAMP instance to test it with:

    lando init --recipe lamp --name lamp --source cwd --webroot .

You also need to fix the root index.html by copying /index/index.html to /index.html and removing ../ from paths in that file:

    mv index/index.html .

The final task is to search and replace the link to the root index.html in all files, by updating the string href="../index/" to simply href="/":

    find . -name "*.html" -type f -print0 | xargs -0 perl -i -pe "s|\.\./index/|/|g"

For more cleanup tips, see Drupal on Mothballs - Convert Drupal 6 or 7 sites to static HTML.

You can even copy the mirror's index/index.html file to index.html one level up; in the copy, all '../' must be removed from the source. If you do this, you can point your DOCUMENT_ROOT at the directory that now holds index.html. This way you preserve even more of the former site structure and make sure that "/" requests will not fail.
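The query-string cleanup described for Wget can also be wrapped in a small shell function that copes with spaces in file names and refuses to overwrite an existing file. This is only a sketch under stated assumptions: strip_query_strings is a hypothetical helper name (it is not part of wget or HTTrack), and skipping on a name collision is a design choice made here rather than something the original instructions prescribe.

```shell
#!/bin/sh
# Sketch: rename mirrored files such as "logo.png?itok=qRoiFlnG" to "logo.png".
# strip_query_strings is a hypothetical helper, not a wget or HTTrack feature.
# Usage: strip_query_strings [MIRROR_DIR]   (defaults to the current directory)
strip_query_strings() {
  # -depth lists directory contents before the directory itself, so a parent
  # directory is never renamed while its children still need processing.
  find "${1:-.}" -depth -name '*\?*' -exec sh -c '
    for path do
      dir=$(dirname -- "$path")
      base=$(basename -- "$path")
      clean=${base%%\?*}   # drop everything from the first "?" onward
      # Assumption: skip rather than overwrite when the cleaned name is empty
      # or already taken (e.g. two variants of the same image).
      if [ -z "$clean" ] || [ -e "$dir/$clean" ]; then
        printf "skipped: %s\n" "$path" >&2
        continue
      fi
      mv -- "$path" "$dir/$clean"
    done
  ' sh {} +
}
```

Run it against the mirror directory after the download finishes, for example `strip_query_strings ./static_cache`. Any file it reports as skipped still carries its query string and has to be resolved by hand.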