Last updated | 2002-05-17 21:51:35 EDT |
Doc Title | XPAT Database Maintenance |
Author 1 | Wilkin, John Price |
CVS Revision | $Revision: 1.8 $ |
[Editor's note: This text is in the process of being adapted from the original Open Text manual, chapter 14 in the DBA section. References to sections with a "14" prefix are internal to this document. The original document has a heavy emphasis on MFS index building, which has not yet been corrected, and on "dbbuild", which DLXS does not support or recommend. This text was drawn from OCR, and so many errors exist, and figures are typically no longer meaningful.]
This chapter discusses the xpatmaint program that provides update capability for XPAT databases. This chapter is divided into four sections. Section 14.1 introduces the xpatmaint database maintenance program; Section 14.2 illustrates the various update operations that xpatmaint can perform, using a simple database; Section 14.3 discusses xpatmaint's operational characteristics and statistics; and Section 14.4 describes how to install and use the Customer Interaction Database, an example of a complex record-oriented database that is supplied with all DLXS XPAT software distributions.
The xpatmaint utility is a database maintenance program that provides facilities for adding text to, and deleting text from, XPAT databases. The xpatmaint program adds text to XPAT databases by appending a relatively small piece of text (called the append text) to the end of a main database. It deletes text from XPAT databases by deleting any sections of the main database that are no longer needed (called deletion regions).
The xpatmaint utility performs an update by updating the text file, then the Main Index file, the Region Subindex files, Fast Find index files and Fast Region index files. Most of these XPAT database files are updated in-place. For this reason, updates are batch oriented. Finally, xpatmaint uses an algorithm that efficiently computes the exact modifications that it has to perform on each file. This algorithm is faster than updating the text file manually and completely rebuilding the XPAT database with each of the index building tools - xpatbld, multirgn, xpatrgn, xpatfr, xpatffi, and xpatffw.
Figure 14-1: DLXS XPAT database Maintenance Components
The components named in Figure 14-1 are as follows. The Main DB is the database that requires maintenance, and the Append DB is the small amount of text that is to be merged into the main database. The New Text consists of the newly arrived text and the modified text from the main database. The modified text from the Main DB generates delete regions that are appended to the Deletion Regions file. After the New Text has accumulated to a certain size, various build commands, such as dbbuild, can turn the New Text into the Append DB. After the preparation of the Append DB is completed, xpatmaint can be used to merge the Main DB, the Deletion Regions file and the Append DB into a new Main DB, which reflects all the updates.
Suppose we have a sample text file, called 'main'. Assume 'main' contains the following text:
<p>This is line 1</p>
<p>This is line 2</p>
This text contains two short paragraphs, each surrounded by <p> and </p> tags. Because xpatmaint uses the start and end positions of the regions it updates, the following lines identify the start and end positions of the two paragraphs of our sample text. These positions will be used in the rest of the example.
<p>This is line 1</p>
1 22
<p>This is line 2</p>
23 44
Note: The end locations (22 and 44) actually point to the newline characters that follow the two </p> tags. The available database maintenance operations are update, add, and delete; they are discussed in that order in the sections that follow.
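The positions above can be reproduced with a short sketch. This is only an illustration in Python, not part of the XPAT tools; the function name line_regions is hypothetical, and it assumes one character per byte and a newline at the end of every line.

```python
def line_regions(text):
    """Return 1-based, inclusive (start, end) positions for each line.

    The end position includes the trailing newline, matching the
    convention used in the example above.
    """
    regions = []
    position = 1  # 1-based position of the next unread character
    for line in text.splitlines(keepends=True):
        start = position
        end = position + len(line) - 1
        regions.append((start, end))
        position = end + 1
    return regions


main_text = "<p>This is line 1</p>\n<p>This is line 2</p>\n"
print(line_regions(main_text))  # [(1, 22), (23, 44)]
```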
Suppose we wish to update the second line in the 'main' database - the line between locations 23 and 44 inclusive. This is done in two operations: adding the new text and deleting the old text. The new text will be in an append file, which we will call 'append', and will consist of the following line:
<p>This is the new and improved line 2</p>
In order to specify the portion of the main text to delete, we must have a deletion region file, which we will call 'del'. This file contains one line for each region of the 'main' database that is to be deleted. Since we only wish to delete one line, the 'del' file will only contain the following line:
23 44
The xpatmaint program requires that both the 'main' file and the 'append' file have complete XPAT databases built on them. This includes an associated DD ('.dd') file and a Main Index ('.idx') file. If the 'main' or 'append' files have regions specified, the region ('.rgn') files are also considered part of a complete XPAT database.
To perform the actual update operation, the xpatmaint program is invoked as follows:
% xpatmaint -D main.dd -d del -a append.dd
In this mode, xpatmaint will work quietly. This means that it will not display any information regarding what it is doing. Should you choose to view what xpatmaint does as it works, you can add a '-v' option to the command line shown above. The '-v' option turns "verbose" (or descriptive) mode on. After the update operation is complete, the main file will look like the following:
<p>This is line 1</p>
<p>This is the new and improved line 2</p>
In this scenario, we take the original 'main' database and append the 'append' file to the end of it. However, unlike the first scenario, we do not want to delete any portion of the 'main' text. To achieve this result, we invoke xpatmaint without the delete option. This means that no text will be deleted. The following command will perform the desired action:
% xpatmaint -D main.dd -a append.dd
After the update operation is complete, the main file will look like this:
<p>This is line 1</p>
<p>This is line 2</p>
<p>This is the new and improved line 2</p>
This next scenario will demonstrate how to delete a line of text from the main database without adding any new text. The xpatmaint program always requires an append option. Therefore, to delete text without adding any text, we must create an empty append database, which we will call 'empty_append'. This database essentially consists of an empty text file, along with a DD that refers to it. This DD only needs a Text segment and an Indices segment. The Text segment should refer to the empty text. The Indices segment only needs to be present for xpatmaint to recognize the DD as a valid DD; it doesn't need to refer to an actual index file (since xpatmaint doesn't use the append database's index). We then use the same setup as before, except that the append database specification is 'empty_append.dd':
% xpatmaint -D main.dd -d del -a empty_append.dd
After the completion of the update operation, the main file will contain only the following:
<p>This is line 1</p>
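The three scenarios can be modeled at the text level with a small Python sketch. It only mimics what happens to the text file (remove the deletion regions, then append the new text); it does not touch any index, and the function name apply_update is hypothetical rather than anything xpatmaint provides.

```python
def apply_update(main_text, append_text="", delete_regions=()):
    """Model the text-level result: delete regions, then append.

    delete_regions holds 1-based, inclusive (start, end) pairs in
    increasing, non-overlapping order.
    """
    kept = []
    cursor = 0  # 0-based index of the next character to keep
    for start, end in delete_regions:
        kept.append(main_text[cursor:start - 1])
        cursor = end
    kept.append(main_text[cursor:])
    return "".join(kept) + append_text


main_text = "<p>This is line 1</p>\n<p>This is line 2</p>\n"
append_text = "<p>This is the new and improved line 2</p>\n"

updated = apply_update(main_text, append_text, [(23, 44)])  # update scenario
added = apply_update(main_text, append_text)                # add scenario
deleted = apply_update(main_text, "", [(23, 44)])           # delete scenario
```

Here `updated` holds line 1 followed by the improved line 2, `added` holds all three lines, and `deleted` holds only line 1.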
The xpatmaint program uses an algorithm that efficiently computes the exact modifications that it has to perform on each file. This algorithm is faster than updating the text file manually and completely rebuilding the XPAT database. However, there are assumptions about the characteristics of the main database and the append database:
If xpatmaint is given a main database and an append database, it will add the append text to the end of the given main text. If deletion regions are specified to xpatmaint, the corresponding regions will be deleted from the main text. In either case, xpatmaint updates the Main Index, the Region Subindices, the Fast Find indices and the Fast Region indices. The update operation involves six steps:
After the above six steps have been performed, the main database's DD file is updated. The xpatmaint program physically modifies various files of the main database (the text file, the Main Index file, the Region Subindex files, the Fast Find index files and the Fast Region index files). Thus, at certain periods of xpatmaint execution, the database may be inconsistent and must be off-line. The stage-by-stage discussion later in this chapter explains when the database must be off-line and when it can be on-line.
Another point to note regarding xpatmaint's mode of operation involves crash recovery. While xpatmaint is modifying the various index files of the main database, the database can be considered corrupted from a user's point of view. If xpatmaint is aborted before it has finished modifying all the files (e.g., accidentally or due to a machine crash), the database will be left in a corrupted state and will not be usable. As such, it is important to BACK UP YOUR DATABASE BEFORE RUNNING xpatmaint!
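A backup can be as simple as copying every database file to another directory before the update starts. The following Python sketch assumes that all of the database's files live in a single directory; the function name backup_database and the directory arguments are hypothetical and are not part of the XPAT distribution.

```python
import shutil
from pathlib import Path


def backup_database(db_dir, backup_dir):
    """Copy every regular file in db_dir to backup_dir.

    Returns the sorted list of file names that were copied.
    """
    db_dir, backup_dir = Path(db_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(db_dir.iterdir()):
        if path.is_file():
            # copy2 also preserves timestamps where the platform allows it
            shutil.copy2(path, backup_dir / path.name)
            copied.append(path.name)
    return copied
```

If the DD refers to files outside that directory, those files would have to be copied as well.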
The xpatmaint program is generally used to perform regular additions of new text to an existing database and to delete some regions from the old text. Performing the addition operation generally works as follows:
The above operations are usually placed in a script file to automate the process. The Customer Interaction Database (Section 14.4) provides a good example of how a system of shell scripts can be used to automate the process.
Note: Since xpatmaint re-indexes the append file during the update operation, the use of xpatbld in Step 2 above is only to provide a valid DD file for the append database. The index that it produces is superfluous and is never used by xpatmaint. As such, the index need not be built with the same character mappings or index point specifications as the main database. At present, there is no option for xpatbld to fake the index-building step and simply produce a valid DD. However, the time taken to build the index can be eliminated by having the index-building script simply build an initial DD file from a pre-generated template in place of running xpatbld in Step 2.
The Fast Find indices do not need to be built for the new text, because xpatmaint uses the main database's DD to determine whether these indices are required. If the main database contains these indices, the Fast Find indices for the new text are generated automatically in main memory and prepared for merging.
The Fast Region indices will be rebuilt either as defined in the main database or in the append database after both databases have been merged. Therefore, if the main database contains the definition for the Fast Regions, the same Fast Region is not required to be constructed for the new text.
There is one important point to note about building the append database. The entries in the append database's DD that specify the various files in the database should contain full pathnames. This is necessary because the main database's DD usually has relative file pathnames (to allow the main database to be easily relocated in the file system without requiring changes to the DD). Because of this, xpatmaint is usually run from the main database's directory. As such, full pathnames are necessary in the append database's DD to provide unambiguous references to its various files from anywhere in the file system (in particular, from the main database's directory).
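The effect can be illustrated with a short Python sketch. It assumes that a relative entry in a DD is resolved against the current working directory, as the paragraph above implies; the directory names and the helper resolve are hypothetical.

```python
import posixpath


def resolve(dd_entry, working_dir):
    """Resolve a DD file entry as a process running in working_dir would."""
    return posixpath.normpath(posixpath.join(working_dir, dd_entry))


main_dir = "/data/main"      # hypothetical directory xpatmaint is run from
append_dir = "/data/append"  # hypothetical home of the append database

# A relative entry is looked up in the main database's directory (wrong place).
relative = resolve("append.txt", main_dir)

# A full pathname resolves to the same file from any directory.
absolute = resolve(append_dir + "/append.txt", main_dir)
```

With these values, `relative` is '/data/main/append.txt' while `absolute` is '/data/append/append.txt'.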
The xpatmaint program requires a deletion regions file to specify the portions of the main text that are to be removed. The deletion regions file consists of one or more lines. Each line corresponds to a separate deletion region and contains two numbers. These numbers are the 1-based positions of the first character and the last character in the region that is to be deleted ('1-based' means that the first character in the file is at position 1, and not position 0). The region positions must be monotonically increasing and no region should overlap another region. Some typical entries in the deletion regions file would be:
120 345
790 930
3502 5607
The above file specifies three deletion regions, each inclusive of its end position: 120 to 345, 790 to 930, and 3502 to 5607. The positions are monotonically increasing and no regions intersect. Violating either of these rules will produce unpredictable results.
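Because a malformed file produces unpredictable results, it can be worth checking the file before running xpatmaint. The following Python sketch is one possible check, not an XPAT utility; the function name is hypothetical, and skipping blank lines is an assumption of this sketch.

```python
def parse_deletion_regions(text):
    """Parse and validate the contents of a deletion regions file.

    Returns a list of (start, end) pairs. Raises ValueError when a line
    is malformed, a region is reversed, or regions are out of order or
    overlap.
    """
    regions = []
    previous_end = 0
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected two numbers")
        start, end = int(fields[0]), int(fields[1])
        if start < 1 or end < start:
            raise ValueError(f"line {number}: invalid region {start} {end}")
        if start <= previous_end:
            raise ValueError(f"line {number}: region overlaps or is out of order")
        regions.append((start, end))
        previous_end = end
    return regions


sample = "120 345\n790 930\n3502 5607\n"
print(parse_deletion_regions(sample))  # [(120, 345), (790, 930), (3502, 5607)]
```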
The xpatmaint program consists of five major stages. The first stage consists of a full scan of the main text to generate the index update directives. The second stage consists of an in-place update of the main text and the Main Index, using the update directives produced in the first stage. The update of the main text involves physically appending the append database's text to the end of the main database's text file, and deleting the sections of text specified in the deletion regions file. The third stage consists of merging the append database's region files with those of the main database. The fourth stage consists of merging the Fast Find indices from the main database and the append database. Finally, the fifth stage consists of rebuilding all the defined Fast Region indices from both the main database and the append database. For each stage, there are different index files that are being read and updated. The following table summarizes the operations:
Stage | Operations | Files Read | Files Updated or Generated | Files Removed After |
1 | Scanning the main database and generating the update directives | main text, append text, data dictionary | pmt_dir | none |
2 | Rolling over text file and rolling over Main Index | main text, main index, pmt_dir | main text, main index, pmt_sv_dir | none |
3 | Rolling over Region Subindices | region indices | region indices, data dictionary | pmt_dir |
4 | Merging Fast Find indices and Word List indices | fast find indices, pmt_sv_dir | fast find indices, data dictionary | pmt_sv_dir |
5 | Rebuilding Fast Region indices | none | fast region index, data dictionary | none |
During the first stage of xpatmaint, users can still search the database since the main and append database text files are only scanned, not physically changed. However, the main database must be taken off-line for stages 2 to 5. The first stage generally takes much longer to run than the other stages. As such, it is sometimes convenient to have stage 1 run while users are using the database (e.g. as a low-priority process during the day), and then run stages 2 to 5 afterwards (e.g. at night). This policy can be implemented using the partial execution options to xpatmaint. If xpatmaint is run with only the '-1' option specified, it will only perform stage 1 and will write the index update directives into a file called 'pmt_dir' in the current directory. When the time comes to perform stages 2 to 5, xpatmaint can be executed with the '-2', '-3', '-4' and '-5' options specified. The xpatmaint program will then read the update directives from the 'pmt_dir' file and update the index and region indices. If stage 4 is required, the 'pmt_sv_dir' directives file will be created during stage 2 processing; these directives are required to update the Fast Find indices. Stage 5 does not need any directives.
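The day/night split can be sketched as two command lines. The Python helper below only assembles argument lists; it does not run anything. The stage flags and the '-D', '-d' and '-a' options come from this chapter, but their order on the command line, and the assumption that stages 2 to 5 take the same database arguments as stage 1, are guesses that should be checked against your installation. The name stage_command is hypothetical.

```python
def stage_command(stages, main_dd, append_dd, delete_file=None, verbose=False):
    """Build an xpatmaint argument list for the given stages (1 to 5)."""
    args = ["xpatmaint"]
    if verbose:
        args.append("-v")
    args += [f"-{stage}" for stage in sorted(stages)]
    args += ["-D", main_dd]
    if delete_file:
        args += ["-d", delete_file]
    args += ["-a", append_dd]
    return args


# Stage 1 during the day, while users are still searching.
daytime = stage_command({1}, "main.dd", "append.dd", delete_file="del")

# Stages 2 to 5 at night, with the database off-line.
nighttime = stage_command({2, 3, 4, 5}, "main.dd", "append.dd", delete_file="del")

print(" ".join(daytime))    # xpatmaint -1 -D main.dd -d del -a append.dd
print(" ".join(nighttime))  # xpatmaint -2 -3 -4 -5 -D main.dd -d del -a append.dd
```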
There is another benefit from these options. Even if no partial execution options are specified, xpatmaint still writes the update directives to the 'pmt_dir' file after it has finished stage 1. The 'pmt_dir' directives file is only removed after stage 3 completes. Should a machine crash occur after stage 2, it is only necessary to restore the index and region files before re-running xpatmaint with the '-2' and '-3' options specified. Should a machine crash occur during stage 3, only the region files would need to be restored before re-running xpatmaint with the '-3' option specified. After stage 2 is successfully completed, the 'pmt_sv_dir' directives file will be created. As long as 'pmt_sv_dir' has been created successfully, if the machine crashes during stage 4, only the fast find index files would need to be restored before re-running xpatmaint with the '-4' option specified.
Stage 5 will completely rebuild all Fast Region indices specified either in the main database or in the append database. Therefore, no file is needed to be restored before re-running xpatmaint with the '-5' option specified.
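The recovery rules above can be restated as a lookup table. The Python sketch below only encodes what this section says for stages 2 to 5; the names RECOVERY and recovery_plan are hypothetical, and the sketch makes no claim about crashes in other situations.

```python
# stage where the crash occurred -> (files to restore, options to re-run with)
RECOVERY = {
    2: (["index file", "region files"], ["-2", "-3"]),
    3: (["region files"], ["-3"]),
    4: (["fast find index files"], ["-4"]),
    5: ([], ["-5"]),
}


def recovery_plan(crashed_stage):
    """Return (files_to_restore, rerun_options) for a crash in stages 2 to 5."""
    if crashed_stage not in RECOVERY:
        raise ValueError("this sketch only covers stages 2 to 5")
    files, options = RECOVERY[crashed_stage]
    return list(files), list(options)


print(recovery_plan(3))  # (['region files'], ['-3'])
```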
The following times are the execution characteristics of xpatmaint running on a Sun SPARCstation 2. Before the first stage begins, a setup stage is performed. The time required to perform the setup is related to the size of the append text. The setup stage for a 1 MB append text typically requires about 20 seconds. Refer to the Stage Operation Summary table in the previous section for description of various stages.
During stage 1, the scan rate for the main text is logarithmic in the size of the append text but tends to level off when the size of the append text file exceeds 2 MB. Using the '-o' optimization option, a typical scan rate for a 1 MB append text file is 140 KB/sec. Stage 1 therefore requires time equal to the size of the main text divided by the effective scan rate.
Stage 2 merges the append index with the main index. The rate at which this stage progresses varies with the relative sizes of the main and append texts. When the append text is 10% the size of the main text, the processing rate is typically 300 KB/sec. Stage 2 requires time equal to the size of the main index (not the main text) divided by the effective processing rate.
Stage 3 merges the append region files with the main region files. The time required to do this depends on the total size of all the region files from both databases but a typical processing rate is 400 KB/sec.
Stage 4 merges the append fast find index files with the main fast find index files. The time required to do this depends on the size of the fast find index files and the number of delete regions.
Stage 5 rebuilds all fast region indices. The time required will be the same as the time required to run xpatfr independently on each fast region.
When all the above stages (except stage 4 and stage 5) are combined, the overall processing rate for the addition of an append text that is 1% the size of the main text, on a Sun SPARCstation 2, is approximately 120 KB/sec. If stage 4 is included, the rate drops to approximately 40 KB/sec. If stage 5 is also included, the rate decreases further in proportion to the number of fast regions that need to be rebuilt.
For example, if there is a 500 Mbyte main database and a 5 Mbyte append database, the time required to merge them together is approximately 1.2 hours. If stage 4 is also required to update the Fast Find indices, the time required to merge them together is approximately 3.6 hours. On the other hand, completely rebuilding the Main Index alone with xpatbld will require approximately 58 hours. Using xpatmaint is dramatically faster than rebuilding the indices.
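The arithmetic behind these figures can be checked with a short Python sketch. It assumes that the overall rate applies to the combined size of the main and append texts and that 1 Mbyte is 1024 Kbytes; the function name merge_hours is hypothetical.

```python
def merge_hours(main_mb, append_mb, rate_kb_per_sec):
    """Estimate the merge time in hours at a given overall processing rate."""
    total_kb = (main_mb + append_mb) * 1024
    return total_kb / rate_kb_per_sec / 3600


without_fast_find = merge_hours(500, 5, 120)  # stages 1 to 3
with_fast_find = merge_hours(500, 5, 40)      # stages 1 to 4

print(round(without_fast_find, 1))  # 1.2
print(round(with_fast_find, 1))     # 3.6
```

The same function can be fed the rates measured on your own hardware, which will differ from the SPARCstation 2 figures quoted here.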
For different databases, the time characteristics vary depending on the previously described parameters, such as the size of the main database and the number of delete regions. Therefore, it is always good practice to keep track of the time statistics for your specific database. The verbose '-v' option and the logfile '-l' option can be used to monitor and record the xpatmaint execution. Time statistics will be reported at each stage. A typical xpatmaint execution with verbose mode turned on will generate the following output:
% xpatmaint -v -o -l logfile -D main.dd -d del -a app.dd
** xpatmaint version 5.x.x (incr: xxxx)
** user_name Sat Oct 1 08:06:16 1994
Setting up big database (main.dd) for merging
Setting up small database (app.dd) for merging ...
Setting up update records for merging ...
Perform stage #1 merging
* Setup took (1.8000) cpu seconds
* Index 10840 points
Scanning the big text .
* Scan took (25.6666) cpu seconds (115.8265 K/sec)
Computing the delete text ...
* Compute deletes took (0.1333333) cpu seconds
Perform stage #2 merging ...
Rolling over text file (main.txt) .
5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55% 60% 65% ...90% 95% 100%
* Rolling over text took (1.6333) cpu seconds (1847.2320 K/sec)
Rolling over index file (main.idx) ...
5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55% 60% 65% ...90% 95% 100%
* Rolling over index took (5.7166) cpu seconds (389.2368 K/sec)
Perform stage #3 merging ...
Rolling over regions ...
Merging region (Entry ) into (main.rgn)
Rolling over region file (main.rgn) ...
5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55% 60% 65% ...90% 95% 100%
* Rolling over index took (2.0000) cpu seconds (444.9765 K/sec)
Perform stage #4 merging ...
Rolling over Fast-Find index ...
* Rolling over Fast-Find index took (15.7500) cpu seconds
Rolling over word-list index ...
* Rolling over word-list index took (16.5666) cpu seconds
Perform stage #5 merging .
Rebuilding Fast-Region ...
Rebuilding region (Entry ) Fast-Region file (Entry.fri)
* Rebuilding Fast-Region took (0) cpu seconds
* Total time (69.0999) cpu seconds (43.0228 K/sec)
*** user_name Sat Oct 1 08:08:59 1994
The time statistics are reported in the form 'took (xx.xxxx) cpu seconds'. If the file size is measurable, the processing rate is reported as '(xxx.xxxx K/sec)'. The output is useful for more than collecting time statistics: the files that have been updated are clearly reported, so a problem, such as a file that cannot be accessed or a full disk, can be spotted easily. Knowing the stage where the problem occurred is important for restarting xpatmaint with the partial execution options described earlier in this chapter.
Note: If the problem cannot be resolved, the error can be described more precisely with the logfile output. In the previous example, the logging messages are stored in the file 'logfile' specified by the '-l' option. The logfile output is exactly the same as the verbose mode output.
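To keep track of time statistics over many runs, the timing lines can be extracted from the logfile. The Python sketch below relies only on the line shapes shown in the sample output above; real output from other versions may differ, and the names TIMING and parse_timings are hypothetical.

```python
import re

# Matches lines such as "* Scan took (25.6666) cpu seconds (115.8265 K/sec)"
# and "* Total time (69.0999) cpu seconds (43.0228 K/sec)".
TIMING = re.compile(
    r"^\*\s*(?P<label>.+?)(?: took)? \((?P<seconds>[\d.]+)\) cpu seconds"
    r"(?: \((?P<rate>[\d.]+) K/sec\))?"
)


def parse_timings(log_text):
    """Return (label, cpu_seconds, kb_per_sec or None) for each timing line."""
    timings = []
    for line in log_text.splitlines():
        match = TIMING.match(line.strip())
        if match:
            rate = match.group("rate")
            timings.append((
                match.group("label"),
                float(match.group("seconds")),
                float(rate) if rate else None,
            ))
    return timings


sample = """\
Perform stage #1 merging
* Setup took (1.8000) cpu seconds
* Scan took (25.6666) cpu seconds (115.8265 K/sec)
* Total time (69.0999) cpu seconds (43.0228 K/sec)
"""
for entry in parse_timings(sample):
    print(entry)
```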
This section describes a sample application based on the xpatmaint database maintenance utility. As described in the preceding sections, xpatmaint provides maintenance capabilities for XPAT databases via the deletion and addition of regions of text. By using xpatmaint, these deletions and additions can be performed in-place, without re-indexing the entire text database. While xpatmaint can be used in a stand-alone environment for database maintenance, it can also be used as a building block for more complex systems. This section describes such a system. This system uses a number of Unix shell scripts and a few small programs written in C to provide a "record-oriented" database system. The system has been implemented using Unix shell scripts in order to be readily modifiable and adaptable by developers and integrators. The source code for the C language portions of this sample application is also provided in the distribution.
This documentation first describes the system characteristics for this sample xpatmaint application. It then describes a fictitious database which will be used throughout the remainder of this section. It then describes the directory layout and installation procedures for this sample xpatmaint application. The remainder of this section focuses on providing information which will help developers and integrators to modify the collection of scripts that make up this system, to suit their needs. This discussion describes the various database operations in terms of their functionality and their implementation as shell scripts. Each shell script is also heavily commented with descriptions of its implementation and rationale.
When used as a stand-alone program, xpatmaint deletes specified regions from the database, and appends new text to the end of the database. In this sample application we will use xpatmaint for the bulk addition, deletion, and modification of the regions in a database.
Since xpatmaint can delete records from, and add records to a database, it can be used in a situation requiring the modification of a region. To modify a region, the text of the region must first be retrieved from the XPAT database and must be modified. Then xpatmaint is used to delete the old region from the database and append the new, modified version of the region to the end of the database.
In this sample application we build on the idea of using the xpatmaint add/delete function as a modify function and construct a more complex application. We assume that the database we are working with is a "record-oriented" database. By "record-oriented" we mean that there is a single type of region (field, element) which spans the entire database. We refer to this spanning region as a record. For example, in an encyclopedia the record may be an Entry; in a newspaper database it may be a Story; and in a document database the record may be a Document. In all of these cases, all of the text in the database is contained within a record: there is no text outside a record. However, there are several records.
With this record-oriented database in mind, our example system allows a record in the database to be "checked out", modified and "checked in" to the database. The process of "checking out a record" involves storing the location of the record being checked out, copying the text of the record to a temporary file, marking the record for deletion and providing the text of the record for editing. Once that text has been modified, it is "checked in" by putting the new modified version of the record in a spool area. This spool area holds all the modified pieces of text that have to be reincorporated into the main database. At regular intervals, all the pieces of text in this spool area are reintegrated into the main database.
This sample application also provides a locking mechanism which ensures that once a record is checked out for modification it cannot be modified by another user. As with all of the other functionality in this sample application, the locking mechanism is implemented as Unix shell scripts to allow modification by developers.
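The check-out, check-in and locking behaviour can be modeled in memory with a short Python sketch. The real system is implemented as shell scripts operating on files; the class and method names here are hypothetical, and keeping a record locked until the next batch update is an assumption of this sketch.

```python
class CheckoutSpool:
    """In-memory model of check-out, check-in and the spool area."""

    def __init__(self):
        self.locked = {}          # (start, end) -> user holding the record
        self.delete_regions = []  # regions of the main text to remove
        self.spool = []           # modified record texts awaiting the update

    def check_out(self, region, record_text, user):
        """Lock the record, mark it for deletion and return its text."""
        if region in self.locked:
            raise RuntimeError(f"record {region} is already checked out")
        self.locked[region] = user
        self.delete_regions.append(region)
        return record_text

    def check_in(self, region, user, new_text):
        """Place the modified record text in the spool area."""
        if self.locked.get(region) != user:
            raise RuntimeError("record is not checked out by this user")
        self.spool.append(new_text)

    def flush(self):
        """Return (deletion regions, append text) for the next batch update."""
        regions = sorted(self.delete_regions)
        append_text = "".join(self.spool)
        self.locked.clear()
        self.delete_regions.clear()
        self.spool.clear()
        return regions, append_text
```

A typical cycle would call check_out, edit the returned text, call check_in, and later call flush to obtain the inputs for one xpatmaint run.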
In order to provide an example which addresses a slightly more complex application environment, the database in this example is a record-oriented database with three types of records. All of the text in the database is guaranteed to fall within one of these three record types.
The sample application also has a concept of database views. Checked in texts are accumulated in a spool area. An "update" task is periodically initiated to add these texts to the database and to delete the previous copies of these records from the database. This update operation does not get performed on the original database. Instead, a copy of the original database is made and the update is performed on the copy; the original database is left as-is. In this manner, snapshots of the database are available as they were before applying a new batch of updates. These copies of the database are referred to as "views" in the shell scripts and in the following descriptions.
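The copy-then-update policy can be sketched in Python as follows. The directory naming scheme ('view1', 'view2', and so on) and the function name create_next_view are hypothetical; the actual CID scripts and makefile variables determine the real layout.

```python
import shutil
from pathlib import Path


def create_next_view(system_dir, current_version):
    """Copy the current view to a new directory and return the new path.

    The update would then be applied to the copy, leaving the original
    view untouched as a snapshot.
    """
    system_dir = Path(system_dir)
    source = system_dir / f"view{current_version}"
    target = system_dir / f"view{current_version + 1}"
    # copytree raises FileExistsError if the target already exists,
    # which protects an existing view from being overwritten.
    shutil.copytree(source, target)
    return target
```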
The fictitious application scenario for this sample text database is a system that tracks bug reports, engineering change requests and customers for a software company. Throughout the remainder of the documentation it is referred to as the Customer Interaction Database (CID). It is designed to track the three different parts of the integrated system using three different record types:
The Customer record keeps track of information about different customers. This includes such components as the customer's name, company, address, phone number and fax numbers. The Software record is used internally to track information about software products. This information includes things like the current product version, the author, when it was last updated, which documents describe how it works, and its name. The Report record is used to track software bugs reported by customers. It contains references to both the Customer and Software records, such as the CustomerID and the SoftwareID, as well as who the bug is assigned to, and what the current status is. As such, it ties the other two record types together.
This gives users of the CID the ability to add and modify bug reports about different software products, as required. So, for instance, when a new bug is reported, a new Report record is added, with the Status field set to Open. When the bug is fixed, that Report record is edited and the Status field is set to Closed. In addition, the cause of the bug and the steps that were taken to overcome it are also recorded. Using these tools and techniques, software bugs and fixes can be tracked.
Before discussing the various operations provided by the CID (Customer Interaction Database), we will discuss the layout of the directories for the CID applications and describe the procedures for installing the CID sample application.
The proper operation of the Customer Interaction Database sample application assumes a particular directory layout for both the shell scripts and the databases that are being modified. Changes to the default configuration can be made by changing the values in the System Configuration Section of the CID's main makefile. Throughout this section, references made to variables in all-caps (e.g. MAIN_ID) are references to variables defined in the System Configuration Section of that makefile.
This is the highest level directory in the tree, and all database operations are performed from here. This is also the directory where the files not related to a particular view of the CID are stored. These view-independent files fall into the following four groups:
For the first group, an empty record file must be given for each record type. The record file names have the form 'database_name.rec' (they have this name because the CID database as a whole can be viewed as three separate databases, one for each record type, that just happen to all exist in the same database file). The second group of files is used when the next update operation is performed. The third group consists of a log file for the append operation. The fourth group of files helps maintain the version number of the most up-to-date view of the database, which determines the database directory the search process should use. The different views of the database themselves are stored in the different Database Directories.
This group of files resides in the System directory. They are used to keep track of what information is to be added and deleted from the main database when the next update operation is performed. There are four files in this group:
The append text file contains the actual text that will be appended to the database. The add file ('.add') is used to track the movement of append text during multiple edit operations. Each line of the add file has two sets of regions. The first set corresponds to a region of the main database. The second set corresponds to the region of the append text that will replace the region in the main database.
The delete file ('.del') contains a list of regions that will be deleted from the main database. This list corresponds to the first two columns of the add file.
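The relationship between the two files can be sketched in Python. The exact layout of the add file is not given here, so this sketch assumes four whitespace-separated numbers per line (main start, main end, append start, append end); the function names are hypothetical.

```python
def delete_regions_from_add(add_text):
    """Derive the deletion regions from the first two columns of the add file."""
    regions = []
    for line in add_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        regions.append((int(fields[0]), int(fields[1])))
    # xpatmaint expects monotonically increasing regions
    return sorted(regions)


def format_delete_file(regions):
    """Render regions as one 'start end' pair per line."""
    return "".join(f"{start} {end}\n" for start, end in regions)


add_text = "790 930 1 150\n120 345 151 400\n"
print(format_delete_file(delete_regions_from_add(add_text)), end="")
```

For the sample add file shown, this prints '120 345' followed by '790 930'.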
The lock file ('.lok') is used to ensure that only one application can update the append file at any one time. When the lock file has a size greater than zero, no other application is allowed to modify the append text.
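The size-based convention can be sketched in Python as follows. This is only a model of the rule stated above, not the CID scripts themselves; the function names are hypothetical, and the check-then-write sequence in acquire is not atomic, so two processes racing for the lock could both succeed.

```python
import os


def is_locked(lock_path):
    """The lock is held when the lock file exists and is not empty."""
    return os.path.exists(lock_path) and os.path.getsize(lock_path) > 0


def acquire(lock_path, owner):
    """Try to take the lock; return True on success, False if already held."""
    if is_locked(lock_path):
        return False
    with open(lock_path, "w") as handle:
        handle.write(owner + "\n")  # any non-empty content marks the lock
    return True


def release(lock_path):
    """Release the lock by truncating the lock file to zero length."""
    open(lock_path, "w").close()
```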
The append text, add file, delete file and lock file are all maintained by the system. They are used to track the state of the append text before an update operation is performed.
These directories contain the files that make up each view of the Customer Interaction Database. As described above, whenever the database is updated, a copy of the most recent version is made and the current set of updates is applied to the copy instead of the original. By performing updates in this way, users who are searching the most recent version of the database when the update is performed can continue searching that version without any interruption. Note, however, that one aspect of functionality does change after an update.
Consider a user who starts a search session on the CID. When that session starts, the system automatically starts searching the most recent version. However, after an update is performed, that version is no longer the most recent version. Since the system will only check out records from the most recent version, that user will no longer be able to check records out of the version of the database that he or she is searching, after the update has been performed.
Multiple views of the database are also maintained to aid in database recovery in the case of corruption due to uncommon events, such as power failures.
Each view of the database consists of a standard XPAT database along with two auxiliary files. The XPAT database consists of the Main Index ('.idx') file, the Region Subindex ('.rgn') file, and the DD ('.dd') file. The first auxiliary file is the xpat initialization ('.ini') file. It contains the xpat command to declare the Default Region for the database. The second auxiliary file is the PatMotif50 control ('.pat') file. It tells PatMotif50 such things as what database to search on, where the help file is, which Routing to use and which CheckOut program to use.
Installation is divided into two parts. The first part (Section ) will describe how to install the programs that the database system will use, as well as some related control files. The second part (Section ) will describe how to set up the database, which is a three step process: configuration, setup, and testing. Each of these will be explained in turn.
To prepare your system for the Database Installation process described below, you must first install the programs that the CID uses. You must also prepare some files in the system directory. Before installing the software, you must ensure that the CIDBIN environment variable points to the directory where you want the program files installed. To install the programs, go to the distribution directory and type:
make
This will install the programs that will carry out the database operations. Next, copy the file named 'make.cid' in the distribution directory to the file named 'makefile' in the system directory. The 'makefile' provides the mechanism through which the CID programs are invoked.
Next, prepare a DD file and a PatMotif50 control ('.pat') file. Users who are not familiar with this process should refer to Section 2.1.1 of this guide, and the PatMotif50 Appendix in the Database Administration Reference Guide, for details on how to create these files. Please note that the PatMotif50 control file must contain the following line in order for the Checkout and Edit functions to work:
<CheckOutProg>make edit CID=O</CheckOutProg>
The 'O' in the above line should be replaced by the number you specify on the 'MAIN_ID' line of the 'makefile' in the system directory. Please refer to the comments in that file for a description of this field. Also, the DD must have the following line somewhere within the first set of '<Index>' and '</Index>' tags. Without this line, you will not be able to perform searches.
<InitFile>Main.ini</InitFile>
The 'Main' in the above line should be replaced by the name you specify on the 'MAIN_DB' line of the 'makefile'. Refer to the comments in that file for a description of this field.
Next, copy the DD and the PatMotif50 control files to the system directory, ensuring that they have '.save' extensions. For example, if the 'MAIN_DB' line of the 'makefile' in the system directory has the name 'Main', then the DD file would be called 'Main.dd.save' and the PatMotif50 control file would be called 'Main.pat.save'. You have now installed the programs that the database system will use, as well as some necessary control files. We now proceed to the next step of database setup.
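As an illustration of the copy step described above, the following shell sketch stages both control files with the required '.save' extensions. The function name is hypothetical, and it assumes that the files you prepared are named after the MAIN_DB value with '.dd' and '.pat' extensions and sit in the current directory.

```shell
# stage_control_files MAIN_DB SYSTEM_DIR
# Hypothetical helper, not part of the CID distribution.
# Copies MAIN_DB.dd and MAIN_DB.pat from the current directory into
# SYSTEM_DIR, adding the '.save' extension the setup process expects.
stage_control_files() {
    cp "$1.dd"  "$2/$1.dd.save" &&
    cp "$1.pat" "$2/$1.pat.save"
}
```

For a MAIN_DB value of 'Main', 'stage_control_files Main /path/to/system' would produce 'Main.dd.save' and 'Main.pat.save' in the system directory (the path shown is a placeholder).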
The database installation process requires three steps. First, edit the 'makefile' to configure the system to your needs. This requires adding the names of the regions which form the different types of records to the 'ALL_DB' line of the 'makefile'. Second, run the setup process. This will create all the necessary files and directories. Third, test that the setup that was configured actually works as expected. This section will go through each step in the installation process, explaining in detail what is required.
Configuration
The names of all the record types that are to be used in the database must be added to the 'ALL_DB' line of the 'makefile'. A tagged record file for each record type must also be provided. Examples of these tagged record files are given in the system directory of the sample CID application. Please note that the names of the files used to store each record must have a '.rec' extension. Refer to the documentation in the 'makefile' itself for more details. Please note that under most circumstances, further customization beyond this simple step is not necessary.
The setup process will create all the necessary files and directories for your database. To begin the process, simply enter the following command from the system directory:
make setup
Should the setup process fail to complete (e.g., because incorrect configuration information was specified in the 'makefile'), entering,
make clean_all
will remove any files that were created. Any required changes can then be made before restarting the setup process.
The testing process will ensure that the system is functioning properly by testing each function individually. The sixteen steps follow:
1. Add a new record by typing:
make add DB=record_name
2. Commit the addition of the new record by typing:
make update
3. Test that the search process is working by typing the following. (Note: If you are not familiar with PatMotif50, please refer to the PatMotif Tutorial to help you complete steps (4) and (5).)
make search
4. Search for a string that you know exists in the record you added.
5. Use PatMotif50's checkout facility to edit the record (supply dummy values for the filename and comment when PatMotif50 asks).
6. Quit PatMotif50.
7. Commit the new changes you made to the record by typing:
make update
8. Search for a string that you know exists in the changes you made to the record - the search should return at least one hit.
9. Test the recover operation by typing the following. This should recover the previous version of the database.
make recover
10. Test that the recover operation recovered to a previous version of the database by typing,
make search
11. Search for a string that you know existed only in the later version of the database - no hits should be returned.
12. Test that the reset operation works by typing,
make reset
13. Test that the reset operation did not change the contents of the database by typing,
make search
14. Search for a string that you know existed previously - you should have exactly the same number of hits.
15. Remove the test database by typing,
make clean_all
16. Reinitialize the system for operation by typing,
make setup
Should the testing process fail at any point, please review your schema files and any other configuration modifications you made and attempt the setup and testing process again.
In this section we describe each of the functions supported by the CID sample application. As we mentioned earlier, the CID system is constructed by using Unix shell scripts and a few small filters written in C, for which source code is provided. Each description covers the script which must be invoked to execute the operation, and the various scripts which are in turn executed by the high level script. While describing these operations, an emphasis is placed on conveying the logic of the operations, their interrelationships and the scripts which perform each step of the operation. Each database operation description is followed by a brief description of the corresponding script. The details of how each script performs its functions are described in comments within the scripts themselves.
There are nine types of database operations (listed below) that are used to manipulate the Customer Interaction Database. The function of each of these operations will be described here in terms of the data flow and shell scripts that are called.
make setup (calls: <nothing>)
When the setup process is complete, any database that has already been created can be placed in the database directory. All further operations will then be run against this database instead of the empty initial database.
make add (calls: 'cid_add')
make search (calls: 'cid_search')
make edit (calls: 'cid_edit', which calls: 'cid_get_region_name', 'cid_reg', 'cid_int', 'cid_intadd', 'cid_grep', 'cid_intmov', 'cid_trim', 'cid_size', 'cid_intupd', 'cid_intdel')
make update (calls: 'cid_update', which calls: 'cid_size', 'cid_multiregion')
make recover (calls: 'cid_recover')
make reset (calls: 'cid_reset', which calls: 'cid_size')
make clean (calls: <nothing>)
The 'make clean' command removes the append text files and creates new empty ones. This has the effect of ignoring the current update information. This operation is completely controlled from the makefile. Before the operation is carried out, the user is asked whether the clean process should actually take place. If the user says 'no', the user is informed that the operation was not carried out and the process ends. Otherwise, the cleaning process is carried out as described above and the user is informed when the process completes.
make clean_all (calls: <nothing>)
WARNING: This operation will completely remove ALL databases and files related to this application. This should only be performed if you are absolutely certain that the database environment is to be completely removed.
The 'make clean_all' operation completely removes all databases and all files related to the Customer Interaction Database environment. Before the operation is carried out, the user is asked whether the clean process should actually take place. If the user says 'no', the user is informed that the operation was not carried out and the process ends. Otherwise, the cleaning process is carried out as described above and the user is informed when the process completes.
RESET_COUNT:
This tells the system how many views of the database should be kept after the reset operation. For instance, if you had six views of the database and wished to keep the most recent three, you would set RESET_COUNT equal to 3. The default number is 1 and is set in the makefile. This parameter can be set on the command line, in the environment, or in the makefile, in order of descending precedence.
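The precedence rule stated above (command line first, then environment, then the makefile default) can be pictured with the following shell sketch. It only models the rule: the function name is hypothetical, and it does not show how the CID makefile actually implements the lookup.

```shell
# resolve_param CLI_VALUE ENV_VALUE MAKEFILE_DEFAULT
# Hypothetical helper, not part of the CID distribution.
# Prints the first non-empty value, in descending order of precedence.
resolve_param() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"      # set on the command line
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"      # set in the environment
    else
        printf '%s\n' "$3"      # default from the makefile
    fi
}
```

For example, 'resolve_param "" 3 1' prints 3, because no command line value was given and the environment value overrides the makefile default.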
BATCH:
This tells the system whether you want to have the reset operation run in batch or interactive mode. In interactive mode, you will be asked whether you wish to save old views of the database in '.save' directories or have them deleted. This question will be asked for each view of the database. There are two sub-modes to batch mode. The first, save mode, will save old views of the database in '.save' directories. This is equivalent to replying 'yes' to whether you wish to save an old view in interactive mode. The second batch mode, nosave mode, will remove all but the RESET_COUNT most recent database views and will renumber these starting from MAIN_ID. This is equivalent to answering 'no' every time in interactive mode.
This parameter can be set on the command line, in the environment, or in the makefile, in order of descending precedence. The default is NOBATCH and is set in the makefile.
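The effect of nosave mode on view numbering can be sketched in shell as follows. The sketch assumes that views are identified by integers and that a larger number means a more recent view; neither assumption is confirmed here. The function name is hypothetical, and it only prints a renumbering plan without touching any directories.

```shell
# plan_nosave_reset RESET_COUNT MAIN_ID VIEW_NUMBER...
# Hypothetical helper, not part of the CID distribution.
# Assumes larger view numbers are more recent. Prints one "old new" pair
# per view that is kept; views not listed would be removed.
plan_nosave_reset() {
    count=$1
    main_id=$2
    shift 2
    [ "$#" -gt 0 ] || return 0
    printf '%s\n' "$@" | sort -n | tail -n "$count" |
        awk -v id="$main_id" '{ print $1, id + NR - 1 }'
}

# With six views numbered 1 to 6, RESET_COUNT=3 and MAIN_ID=1:
#   plan_nosave_reset 3 1 1 2 3 4 5 6
# prints:
#   4 1
#   5 2
#   6 3
```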
Example: Setting the BATCH Parameter:
No matter which of the following methods you use, the value you specify for the BATCH parameter will be used whenever the 'make reset' command is issued. The methods shown here can also be used for the RESET_COUNT parameter.
To set the BATCH parameter in the makefile, use your editor to edit the makefile. Move to the following line of the System Configuration Section:
BATCH = NOBATCH
To change the setting of the BATCH parameter, simply edit the right side of the equals sign from NOBATCH to whatever value you want.
Setting the BATCH parameter on the command line can be done as follows:
make reset BATCH=NOBATCH
Again, you can change the parameter setting by changing the value to the right of the equals sign.
Setting the BATCH parameter in the environment when you are using the Bourne Shell can be done with the following commands:
export BATCH
BATCH=NOBATCH
You can change the setting of the parameter by changing the value to the right of the equals sign in the second line above.
To set the BATCH parameter in the environment when using the C Shell, issue the following command:
setenv BATCH NOBATCH
This will set the BATCH parameter to a value of NOBATCH in your C Shell environment.