This chapter explains how to use SWISH and WWWWAIS, Stronghold's standalone indexing and search facilities, including how to
Stronghold comes with two standalone programs for indexing your site and offering search capabilities to users:
Stronghold's installation program places SWISH in ServerRoot/swish, and its configuration file is ServerRoot/conf/swish.conf. Like httpd.conf, it's a simple text file, but it does not use containers.
This section explains the SWISH configuration directives, including ReplaceRules and FileRules, the directives that set the indexing parameters.
IndexDir sets the directory that SWISH indexes. Usually, this is the same as the value you set for the DocumentRoot directive in httpd.conf. You can use more than one of these directives, although each can take only one value. For example, if the Web documents for your virtual hosts are in different directories than the main server documents, you can enter something like this:
IndexDir /usr/local/ww/htdocs
IndexDir /usr/local/ww/vhosts/vhost1
IndexDir /usr/local/ww/vhosts/vhost2
...
SWISH indexes each directory recursively.
This directive sets the path to the index file, where SWISH saves the results of each indexing sweep. Be sure to include a .swish filename suffix.
IndexOnly specifies the types of files SWISH is allowed to index, identified by their filename suffixes. For example, you can limit the index to HTML and PHP files like this:
IndexOnly .html .phtml .php
This directive is case-sensitive. If you omit it, SWISH indexes every file in the directory specified by IndexDir.
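The suffix test can be pictured with a short Python sketch. This is only an illustration of the behavior described above, not SWISH's own code; the function name `is_indexable` is hypothetical, and an empty suffix list stands in for an omitted IndexOnly directive.

```python
def is_indexable(filename, index_only):
    """Return True if filename ends with one of the IndexOnly suffixes.

    An empty list models a configuration without IndexOnly, in which
    case every file qualifies. The comparison is case-sensitive.
    """
    if not index_only:
        return True
    return any(filename.endswith(suffix) for suffix in index_only)


suffixes = [".html", ".phtml", ".php"]
print(is_indexable("index.html", suffixes))  # suffix listed
print(is_indexable("INDEX.HTML", suffixes))  # case differs, so no match
print(is_indexable("logo.gif", []))          # no IndexOnly: everything qualifies
```

Because the match is case-sensitive, a site that mixes `.html` and `.HTML` files would need both suffixes listed.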
This directive sets the reporting option, which can be an integer from 0 through 3, 3 being the most verbose output option.
When FollowSymLink is set to "yes," SWISH follows symbolic links while indexing. When set to "no," SWISH ignores them.
SWISH can index entire files or only their filenames.
NoContents .ps .gif .au .hqx .xbm .mpg .mpeg .pict .jpg .jpeg ...
For files whose suffixes are listed in NoContents, SWISH indexes only their filenames instead of their contents. This directive is case-sensitive.
Certain frequently-occurring words are irrelevant for indexing and searching purposes, such as prepositions, pronouns, articles, and indexicals. IgnoreWords sets the list of words that SWISH ignores. SWISH comes with a default list of several hundred common words, and it adds this list if one of the values for IgnoreWords is "SWISHDefault."
Along with IgnoreWords, this directive can save CPU resources and control the size of the index file.
IgnoreLimit, like IgnoreWords, provides a method of filtering out frequently-occurring words when indexing. Each IgnoreLimit directive takes two values:
For example:
IgnoreLimit 80 256
IgnoreLimit 50 50
The first instance instructs SWISH to ignore words that occur in at least 80 percent of files and in at least 256 separate files. The second instance instructs SWISH to ignore words that occur in at least 50 percent of files and in at least 50 separate files.
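The two-threshold test that IgnoreLimit describes can be sketched in Python. This is a minimal illustration under the assumption that both conditions must hold at once, as the text states; it is not how SWISH is implemented, and the function name `words_to_ignore` and the sample data are hypothetical.

```python
def words_to_ignore(doc_words, percent, min_files):
    """Return the words that meet both IgnoreLimit thresholds.

    doc_words: a list with one set of distinct words per indexed file.
    percent:   minimum percentage of files a word must occur in.
    min_files: minimum number of separate files a word must occur in.
    """
    total = len(doc_words)
    counts = {}
    for words in doc_words:
        for word in words:
            counts[word] = counts.get(word, 0) + 1
    return {
        word
        for word, n in counts.items()
        if n >= min_files and n * 100 >= percent * total
    }


files = [
    {"the", "index", "swish"},
    {"the", "index"},
    {"the", "index", "search"},
    {"the", "swish"},
]
print(sorted(words_to_ignore(files, 75, 3)))   # "the" (4 files) and "index" (3 files)
print(sorted(words_to_ignore(files, 80, 256)))  # nothing: fewer than 256 files exist
```

The second call shows why the file-count threshold matters: on a small site, a high minimum keeps the percentage rule from discarding words that are common only because there are few documents.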
IndexName sets the title of the index file.
IndexDescription is a short description of the index file, or the URL of a description file.
URL is the location of your site's home page.
IndexAdmin gives descriptive information about the administrator responsible for Web indexing. SWISH includes this information in the index file.
Since SWISH does not read the httpd.conf file, it knows nothing about the aliases and virtual hosts you've set up for your site.
ReplaceRules operates on the paths of the files that SWISH indexes, converting them into URLs. For example:
prepend "http://"
replace "/usr/local/httpd/htdocs/" "www.mainhost.com/"
replace "/usr/local/httpd/vhost/vhostl/" "www.vhostl.com/"
FileRules is the opposite of IndexOnly; instead of limiting the indexed files inclusively, it limits them exclusively. SWISH ignores all files that match the parameters you have set with this directive. Operator is one of the following:
Stronghold's installation program places WWWWAIS in ServerRoot/cgi-bin, and its configuration file is ServerRoot/conf/wwwwais.conf. Like swish.conf, it's a simple text file that does not use containers.
This section explains the WWWWAIS configuration directives.
This sets the title of the search results file. If the quoted value is a string, only the string is used. If the value is a filename, WWWWAIS prepends the contents of the file to the search results.
This is the URL of the WWWWAIS search engine.
The integer value for MaxHits is the maximum number of search results WWWWAIS is allowed to return.
SwishBin sets the path to the SWISH indexing engine.
This directive sets the path and description of the SWISH index file.
Search results are pathnames from the index file generated by SWISH. To make this information useful to users, you can use SourceRules to modify the paths. For example:
SourceRules replace "/www/" "http://your.host.com/"
This converts document root paths to proper URLs that take advantage of the DocumentRoot alias.
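To make the effect of such a rule concrete, here is a small Python sketch of the rewriting. It assumes that rules are applied in the order listed and that a replace rule substitutes every occurrence of the first string; neither detail is stated in this chapter, and the function name `apply_replace_rules` is hypothetical.

```python
def apply_replace_rules(path, rules):
    """Apply (old, new) replace rules to a pathname, in the order given.

    Assumption: each rule replaces every occurrence of `old` in the path.
    """
    for old, new in rules:
        path = path.replace(old, new)
    return path


rules = [("/www/", "http://your.host.com/")]
for result in ["/www/index.html", "/www/docs/search.html"]:
    print(apply_replace_rules(result, rules))
```

With the rule from the example above, `/www/index.html` becomes `http://your.host.com/index.html`, which is a link users can follow directly from the results page.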
This directive sets the source descriptions for WAIS sources. For WAISSEARCH sources, the syntax is:
WaisSource hostname port path "description"
This directive sets the location of your icons directory. WWWWAIS uses the icons in this location if UseIcons is set to "yes."
TypeDef matches filename suffixes to filetype descriptions, icon files, and MIME types. WWWWAIS uses this information to generate and sort search response pages. Use as many of these directives as you need.
Each site on your server platform requires its own site index. For example, if you have many virtual hosts, users who access one host must be able to search that host without receiving results from other hosts. Before you can use the WWWWAIS search facility, you must use SWISH to create a site index for each virtual host on your server. You must also update these indexes periodically to ensure that search results are current. With a little creativity, you can create scripts that automate these tasks for you.
The SWISH executable is ServerRoot/swish/swish. To create a site index, run SWISH from the command line, using the following flags to specify options:
Once you have an index file for a host, any HTML search interface for that host must reference the appropriate index file. Make sure the administrator of each virtual host has access to that host's index file and to the instructions contained in the next section, "Creating a Search Interface."
WWWWAIS searches the site indexes created by SWISH. It supports the Boolean operators AND and OR, and can use either GET or POST. The basic form that passes search parameters to WWWWAIS looks like this:
<FORM METHOD=GET
ACTION="/cgi-bin/wwwwais?sourcedir=/usr/local/www/htdocs/vhost/swish&source=index.swish">
Search for:
<INPUT TYPE=TEXT NAME="keywords" SIZE=40>
<INPUT TYPE=SUBMIT VALUE="Search">
</FORM>
You can customize any search form by appending search options to the URL in the ACTION field. You can use as many options as you like in a single form. Option strings must be separated from the ACTION path by a question mark (?). Options must be separated from each other by ampersands (&). For example:
ACTION=/cgi-bin/wwwwais?source=index.src&keywords=sample+search
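If you generate search forms from a script, the option string can be assembled rather than typed by hand. The following Python sketch uses the standard library's `urllib.parse.urlencode` to join options with ampersands and encode spaces as plus signs; the helper name `search_action` is hypothetical, and the option names are the ones used in the example above.

```python
from urllib.parse import urlencode


def search_action(cgi_path, **options):
    """Build an ACTION value: the CGI path, a question mark, then the
    options joined by ampersands. Spaces in values become plus signs."""
    if not options:
        return cgi_path
    return cgi_path + "?" + urlencode(options)


print(search_action("/cgi-bin/wwwwais",
                    source="index.src",
                    keywords="sample search"))
```

Building the string this way also encodes characters such as slashes or ampersands that appear inside an option value, so they cannot be mistaken for separators.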
The rest of this section lists the available search options. Many of them are equivalent to certain configuration directives and can be used to override the settings in the configuration file. If you have multiple virtual hosts, it's important to specify the source index file in each search form using the sourcedir and source options. If you have only one host, you can simply set the source index in the configuration file.
This option specifies the directory that contains the index database. This setting is especially useful if you have several databases in the same directory, in which case WWWWAIS searches them all.
The source option specifies which index database the search engine should search.
This option sets the maximum number of URLs WWWWAIS returns to the user.
The keywords option specifies the keywords to search for.
The isindex option works identically to the keywords option.
This option sets the criteria that WWWWAIS uses to sort the results of a search.
If you are using WAISSEARCH as your search mechanism, you can use this option to specify a remote host to search instead of your local host.
If you are using WAISSEARCH and the host option, you can use this option to specify the port number to access on the remote host you are searching.
When the useicons option is set to "yes," WWWWAIS displays icons with the different files in the search results.
If you use the useicons option, use iconurl to specify the location of your icon files.
This option specifies a source description, which must be one of the values set for WaisSource or SwishSource in the wwwwais.conf file.
You can use searchprog to specify one of these alternative search programs:
Only SWISH comes with Stronghold. If you want to use WAISQ or WAISSEARCH, you must install them separately.