Posted by Steve on Mon 21 Sep 2009 at 15:29
Adding search facilities to a website makes it a lot easier for visitors to find content. When a site is dynamically constructed it is often simple to update the application code to perform the searching, but for sites built from static pages an indexer such as namazu can give you a great search interface in a very short space of time.
The namazu2 package will allow you to create an index of the contents of a local directory, and then search against that index. The package includes a handy CGI script which your site's users can use for that purpose.
To get started you'll need to install the packages:
root@skx:~# aptitude update
root@skx:~# aptitude install namazu2 namazu2-index-tools
Indexing your content
Once you have the packages installed you'll need to create an index of your content. For this demonstration I've got a site located in the directory:
- /home/www/blog.example.org/htdocs/
I'm going to create a new directory to store the index, and call that /index/ - so we'll run the indexer like this:
root@skx:~# mkdir /home/www/blog.example.org/index
root@skx:~# mknmz --output-dir /home/www/blog.example.org/index/ \
    /home/www/blog.example.org/htdocs/
This will very quickly perform the indexing (the next time you run this you'll find it skips content which hasn't changed since the last time you ran it), and create a number of files in the index/ directory:
root@skx:~# cd /home/www/blog.example.org/index/
root@skx:~# ls -l | wc -l
64
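Since the indexer skips unchanged content, it is cheap to re-run whenever the site is updated. The following is only a minimal sketch of a wrapper for that, assuming the directory layout used in this article; the function name reindex_site is made up for illustration, and the only mknmz option used is the --output-dir flag shown above.

```shell
#!/bin/sh
# Sketch: re-run the indexer for one site.
# reindex_site is a hypothetical helper name, not part of namazu.

reindex_site() {
    # $1 = directory holding the content, $2 = directory holding the index
    content_dir="$1"
    index_dir="$2"

    # Refuse to run if the content directory does not exist.
    if [ ! -d "$content_dir" ]; then
        echo "content directory missing: $content_dir" >&2
        return 1
    fi

    # Create the index directory on first use.
    mkdir -p "$index_dir" || return 1

    # Same invocation as the manual run shown earlier.
    mknmz --output-dir "$index_dir" "$content_dir"
}

# Example call, left commented out so that sourcing this file has no side effects:
# reindex_site /home/www/blog.example.org/htdocs /home/www/blog.example.org/index
```

Something like this could be called from cron or from whatever process publishes the static pages, so that the index never lags far behind the content.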
Configuring the CGI Script
Now that you have your content indexed you can allow visitors to your website to actually use that index to search your site.
The search script is located in /usr/lib/cgi-bin/namazu.cgi, so you'll need to ensure that your site can execute it. Alternatively you can do what I do, which is to create a symlink for your site:
root@skx:~# mkdir /home/www/blog.example.org/cgi-bin
root@skx:~# ln -s /usr/lib/cgi-bin/namazu.cgi /home/www/blog.example.org/cgi-bin/namazu.cgi
Now that you have the CGI script available you need to configure it to use the index you created. To do this, create a file named .namazurc in the same directory as the script.
The most basic file would look like this:
##
## Index: Specify the directory where the indexes are located.
##
Index /home/www/blog.example.org/index
## Replace: Replace TARGET with REPLACEMENT in URIs in search
## results.
Replace /home/www/blog.example.org/htdocs/ http://blog.example.org/
With that done you should be able to point your browser at the search URL and see it working.
The results will be presented to the user via the stock templates, but these can be updated.
Customizing things
By default the search interface uses a number of template files from the location /usr/share/namazu/template/ but you can copy the files from that location somewhere else, edit them, and then point the CGI script at them.
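The copy step could be sketched roughly as follows. This is an illustration rather than a required procedure: copy_templates is a hypothetical helper name, the default destination is simply a template directory placed next to the CGI script, and the NMZ.* pattern assumes the template files follow the NMZ.{head,foot,body,tips,result} naming.

```shell
# Sketch: copy the stock templates somewhere they can be edited.
# copy_templates is a hypothetical helper name, not part of namazu.

copy_templates() {
    # $1 = stock template directory, $2 = per-site template directory
    src="${1:-/usr/share/namazu/template}"
    dest="${2:-/home/www/blog.example.org/cgi-bin/template}"

    if [ ! -d "$src" ]; then
        echo "no such template directory: $src" >&2
        return 1
    fi

    mkdir -p "$dest" || return 1

    # Copy only the NMZ.* template files, preserving timestamps and modes.
    cp -p "$src"/NMZ.* "$dest"/
}

# Example call, left commented out so that sourcing this file has no side effects:
# copy_templates
```

The copies in the destination directory can then be edited freely without touching the packaged files, which a package upgrade might otherwise overwrite.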
To specify an alternative location edit the .namazurc file to include:
##
## Template: Set the template directory containing
## NMZ.{head,foot,body,tips,result} files.
##
#Template /usr/share/namazu/index
Template /home/www/blog.example.org/cgi-bin/template
Had you not installed your own .namazurc file, the global one in /etc/namazu/namazurc would have been read. That file contains default settings which serve as a good example.
With the namazu2 package, constructing a website search is a quick and painless operation, and by using a per-domain configuration file you can use the same script & templates across any number of sites.
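To illustrate the per-domain idea, here is a rough sketch of a helper that writes a basic .namazurc for a given domain. It assumes every site follows the /home/www/DOMAIN/{htdocs,index,cgi-bin} layout used in this article and is served over plain http; the function name write_namazurc and its arguments are invented for the example, and only the Index and Replace directives from the basic configuration are emitted.

```shell
# Sketch: generate a basic per-domain .namazurc.
# write_namazurc DOMAIN [BASE] is a hypothetical helper, not part of namazu.
# BASE defaults to /home/www, the layout assumed throughout this article.

write_namazurc() {
    domain="$1"
    base="${2:-/home/www}"

    if [ -z "$domain" ]; then
        echo "usage: write_namazurc DOMAIN [BASE]" >&2
        return 1
    fi

    site="$base/$domain"
    mkdir -p "$site/cgi-bin" || return 1

    # The heredoc delimiter is unquoted so that $site and $domain are expanded.
    cat > "$site/cgi-bin/.namazurc" <<EOF
Index $site/index
Replace $site/htdocs/ http://$domain/
EOF
}

# Example call, left commented out so that sourcing this file has no side effects:
# write_namazurc blog.example.org
```

Each site would still need its own index and its own symlink to the CGI script, created as described earlier.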
This article can be found online at the Debian Administration website at the following bookmarkable URL (along with associated comments):
This article is copyright 2009 Steve - please ask for permission to republish or translate.