You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
138 lines
5.7 KiB
138 lines
5.7 KiB
3 years ago
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
|
||
|
<html>
|
||
|
<head>
|
||
|
<title>
|
||
|
Running ht://Dig
|
||
|
</title>
|
||
|
</head>
|
||
|
<body bgcolor="#eef7ff">
|
||
|
<h1>
|
||
|
Running ht://Dig
|
||
|
</h1>
|
||
|
<p>
|
||
|
ht://Dig Copyright © 1995-2004 <a href="THANKS.html">The ht://Dig Group</a><br>
|
||
|
Please see the file <a href="COPYING">COPYING</a> for
|
||
|
license information.
|
||
|
</p>
|
||
|
<hr size="4" noshade>
|
||
|
<p>
|
||
|
This document will attempt to show the steps needed to use
|
||
|
the ht://Dig system, after <a href="where.html">obtaining</a>,
|
||
|
<a href="install.html">installing</a> and
|
||
|
<a href="config.html">configuring</a> it.<br>
|
||
|
The main sections are:
|
||
|
</p>
|
||
|
<ul>
|
||
|
<li>
|
||
|
<a href="#rundig">Building the databases</a>
|
||
|
</li>
|
||
|
<li>
|
||
|
<a href="#testing">Testing and troubleshooting</a>
|
||
|
</li>
|
||
|
<li>
|
||
|
<a href="#maintenance">Maintaining the system</a>
|
||
|
</li>
|
||
|
</ul>
|
||
|
<hr noshade>
|
||
|
<h2>
|
||
|
<a name="rundig">Building the databases</a>
|
||
|
</h2>
|
||
|
<p>
|
||
|
After setting up all the <a href="config.html">configuration
|
||
|
files</a>, you can build the required databases simply by running
|
||
|
<a href="rundig.html">rundig</a>. This script will run
|
||
|
<a href="htdig.html">htdig</a> first to build the initial database,
|
||
|
then it runs <a href="htpurge.html">htpurge</a> to clean up the
|
||
|
document and word databases that were created by htdig.
|
||
|
It then runs <a href="htnotify.html">htnotify</a>, and finally
|
||
|
runs <a href="htfuzzy.html">htfuzzy</a> if necessary, to build
|
||
|
the endings and synonyms databases if they're missing or outdated.
|
||
|
The rundig script can be customized for your specific needs, or
|
||
|
you can develop your own script that runs any of these programs.
|
||
|
Read the reference sections for each of these programs to get a
|
||
|
better understanding of what each one does.
|
||
|
</p>
|
||
|
<p>
|
||
|
The <a href="htfuzzy.html">htfuzzy</a> program deserves a bit more
|
||
|
explaining. It is used to build databases that are used by some
|
||
|
of the fuzzy match algorithms selected by
|
||
|
<a href="htsearch.html" target="_top">htsearch</a>'s
|
||
|
<a href="attrs.html#search_algorithm">search_algorithm</a>
|
||
|
attribute. The <em>endings</em> and <em>synonyms</em> algorithms
|
||
|
use static dictionaries, so their databases only need to be rebuilt
|
||
|
by htfuzzy when the dictionary files are changed, or when ht://Dig
|
||
|
is initially installed. The rundig script handles the building of
|
||
|
these two databases as needed for the default setup. A few of the
|
||
|
other fuzzy match algorithms use databases that are derived from
|
||
|
the word database built by htdig/htpurge, so if you use these
|
||
|
algorithms you should rebuild their databases with htfuzzy every
|
||
|
time you update your index. This isn't done in rundig, but the
|
||
|
comments in the script show where you can add your htfuzzy commands
|
||
|
as needed. Some fuzzy match algorithms don't need their own
|
||
|
database, as they just operate on the word database, so they don't
|
||
|
need any special setup.
|
||
|
</p>
|
||
|
<hr noshade>
|
||
|
<h2>
|
||
|
<a name="testing">Testing and troubleshooting</a>
|
||
|
</h2>
|
||
|
<p>
|
||
|
Once the databases are built, you should test out htsearch.
|
||
|
It's recommended that you first try a few queries running
|
||
|
htsearch on the command line, as it helps to separate problems
|
||
|
that are specific to ht://Dig from web server or CGI problems.
|
||
|
Once you have that working, try running htsearch from your web
|
||
|
browser, using the search form you configured.
|
||
|
</p>
|
||
|
<p>
|
||
|
If you run into problems at any point in the building and testing
|
||
|
of your databases, there are many things you can do. All ht://Dig
|
||
|
programs feature a <strong>-v</strong> option to get some debugging
|
||
|
output. The more of these options you put on the command line, the
|
||
|
more output you'll usually get. To get help with common problems,
|
||
|
or with interpreting some of the debugging output, please look to
|
||
|
the ht://Dig <a href="FAQ.html">FAQ</a> (frequently asked questions)
|
||
|
as your first line of support. Most of the problems that ht://Dig
|
||
|
users have are explained there, and the on-line
|
||
|
<a href="http://www.htdig.org/FAQ.html">FAQ on the website</a> is
|
||
|
updated frequently as new problems arise. The FAQ will also tell
|
||
|
you where you can turn if your question isn't answered there.
|
||
|
Remember that questions may not be phrased exactly as you'd state
|
||
|
them, so look carefully for anything that seems similar to the
|
||
|
problem you're trying to solve.
|
||
|
</p>
|
||
|
<hr noshade>
|
||
|
<h2>
|
||
|
<a name="maintenance">Maintaining the system</a>
|
||
|
</h2>
|
||
|
<p>
|
||
|
Once everything is running, you have to deal with the question of
|
||
|
how you can keep everything running and up to date. The databases
|
||
|
don't automatically update themselves, of course, so you'll need
|
||
|
to figure out how to schedule automatic updates of the database.
|
||
|
Most users use the <strong>crontab</strong> facility on their
|
||
|
systems to schedule daily or weekly updates of their database.
|
||
|
This can be as simple as running "rundig" or "rundig -a" from
|
||
|
your crontab, or from a file in /etc/cron.daily if your system
|
||
|
uses this, to rebuild from scratch every night. For a small site,
|
||
|
this may take only a few minutes to run. Other sites will run
|
||
|
more elaborate update scripts, to update their existing databases
|
||
|
nightly, and schedule complete rebuilds less frequently, such as
|
||
|
monthly.
|
||
|
</p>
|
||
|
<p>
|
||
|
You need to pay close attention to how long updates take to run.
|
||
|
There are no database lockouts in ht://Dig, so you don't want to
|
||
|
schedule update or reindexing runs so frequently that they run
|
||
|
into each other.
|
||
|
</p>
|
||
|
<hr size="4" noshade>
|
||
|
|
||
|
Last modified: $Date: 2004/05/28 13:15:19 $
|
||
|
<br>
|
||
|
<a href="http://sourceforge.net/">
|
||
|
<img src="http://sourceforge.net/sflogo.php?group_id=4593&type=1" width="88" height="31" border="0" alt="SourceForge Logo"></a>
|
||
|
|
||
|
</body>
|
||
|
</html>
|