Crawler prevention?

5 September 2009, 10:43

Hiawatha version: 6.17.1
Operating System: Ubuntu 9.04 Server

I would like to know whether Hiawatha comes with crawler prevention, to stop crackers from downloading my whole site in order to study and crack it.
Hugo Leisink
5 September 2009, 11:06
If you don't want people to download stuff, don't put it online. And if downloading and studying information on your website leads to a hack, you should fix your website. Hiawatha doesn't prevent a vulnerable website from being hacked. No webserver can do that.
5 September 2009, 11:37
Hugo Leisink,

I agree with your point. However, most CMSes store their SQL username and password inside a text or PHP file. If a crawler fetches these files, the site may be cracked.

I am using XOOPS (a CMS), which comes with a module named "Protector" that can block crawlers. I wonder whether this can be done by the web server instead, because I want to use a CMS other than XOOPS. Can Banshee do that?

Hugo Leisink
5 September 2009, 11:43
A SQL username and password inside a PHP file shouldn't be a problem, because the source is not sent to the visitor, only the output of the script. So, if the script doesn't print the username and password, there is no problem.

A text file inside the website directory containing passwords is dangerous. Never ever place such files inside the website root directory. Never! Banshee is designed to have only a single index.php and the images/javascripts/css inside the website root directory (the 'public' directory). The rest is safely stored outside that directory.
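As a rough way to audit this rule, one can check whether a sensitive file path falls inside the website root. The following is only a minimal Python sketch and not part of Hiawatha or Banshee; the function name `is_inside_root` and the example paths are hypothetical, and the check is purely textual (it does not follow symlinks).

```python
import posixpath


def is_inside_root(file_path: str, website_root: str) -> bool:
    """Return True if file_path lies inside website_root.

    Both arguments are assumed to be absolute POSIX paths. The paths are
    normalised first, so '..' components cannot hide the real location.
    Symlinks are NOT resolved.
    """
    file_norm = posixpath.normpath(file_path)
    root_norm = posixpath.normpath(website_root)
    # Append a slash so that '/site/public_old' is not mistaken for '/site/public'.
    root_prefix = root_norm.rstrip("/") + "/"
    return file_norm == root_norm or file_norm.startswith(root_prefix)


# Hypothetical layout: only the 'public' directory is served.
print(is_inside_root("/var/www/site/public/passwords.txt", "/var/www/site/public"))
print(is_inside_root("/var/www/site/settings/db.conf", "/var/www/site/public"))
```

Any file for which this returns True can be requested through the webserver unless access is denied explicitly.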

Hiawatha can help you prevent access to those files. You can do that via the UrlToolkit:
UrlToolkit {
    ToolkitID = protect_files
    Match ^/path/to/file.txt DenyAccess
    Match ^/path/to/directory/ DenyAccess
}

VirtualHost {
    ...
    UseToolkit = protect_files
}

But this should only be seen as a workaround for badly designed websites.
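To get a feel for which request paths such Match patterns cover, they can be tried out with a regular-expression engine. The sketch below uses Python's `re` module as an approximation only: Hiawatha has its own regular-expression handling, and the helper `is_denied` and the sample paths are hypothetical. The patterns are the placeholder paths from the example.

```python
import re

# Placeholder patterns, copied from the Match lines of the UrlToolkit example.
DENY_PATTERNS = [r"^/path/to/file.txt", r"^/path/to/directory/"]


def is_denied(request_path: str, patterns=DENY_PATTERNS) -> bool:
    """Return True if any pattern matches the request path.

    This only approximates the Match ... DenyAccess rules with Python's
    regex engine; it is not how Hiawatha itself evaluates them.
    """
    return any(re.search(pattern, request_path) for pattern in patterns)


for path in ["/path/to/file.txt", "/path/to/directory/secret.ini",
             "/path/to/file.txt.bak", "/index.php"]:
    print(path, is_denied(path))
```

Note that the patterns are anchored only at the start, so a path such as /path/to/file.txt.bak matches as well, and that an unescaped dot in a regular expression matches any character.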
This topic has been closed.