WikiSlurp

Last edit September 16, 2003

wiki-slurp is a simple PythonLanguage script that will save a remote wiki site as local files. The links are translated as well. See http://www.cs.uu.nl/~hanwen/public/software/wiki-slurp.py.

Why not just ask for a copy of the database? Seems like it would cause a lot less load on the wiki server. The admin would love you for that.

Next irritating question: what's the difference between slurping a normal Web site and slurping a wiki? They're both a forest of descending links, right?

They'd have to be if you're pulling from the outside. It seems to me that a slurp rather than a copy of the DB would have the unfortunate result of excluding orphaned pages.

For CategorizedRecentChanges, I wrote a script that downloaded every page on RecentChanges and then every change since the last fetch every five minutes. The bandwidth and server load it took was very high even after optimizing it. I would not recommend doing such a thing frequently. -- SunirShah

Wiki has several stages of protection from denial of service attacks that can be and have been triggered by such programs. See WikiMirrors for a much more site friendly approach bulk downloads. -- WardCunningham

I wrote wiki-slurp because it was so simple, and I needed the functionality. Its original purpose was to see how difficult it would be to convert the wiki site that I run with a close friend to local HTML. I haven't tested it in daily use.

Perhaps it would be wiser to have the site run the script locally and let me mirror a .tar.gz of the Web pages.

Why not .tar.bz2?

The problem I see with copying the DB itself, is that I would have to set up and run the same WikiEngine as the original site itself (isn't that right?), which at the moment is not very convenient, and also feels like DuplicateFunctionality (or whatever it is called in the repository). -- HanWenNienhuys