Each DLXS XPAT database has a Data Dictionary containing information to:
A Data Dictionary is made up of several sections. Each section is delimited by ``tags'' - short labels enclosed by angle brackets: `<' and `>'. For example, information about the database's text is preceded by a <Text> start tag and followed by a </Text> end tag. The slash (`/') character is used to distinguish between end tags and start tags. The entire Data Dictionary is enclosed by <DB> and </DB> tags.
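As an illustration of the tag structure described above, the following Python sketch pulls the contents of a named field out of a Data Dictionary held in a string. It is not part of XPAT: the function name field_values is hypothetical. A regular expression is used instead of an XML parser on the assumption that a Data Dictionary is not guaranteed to be well-formed XML.

```python
import re

def field_values(dd_text, tag):
    """Return the contents of every <tag>...</tag> pair found in dd_text."""
    # Non-greedy match so that adjacent fields are not merged.
    # Assumes a field is never nested inside a field of the same name.
    pattern = re.compile(r"<{0}>(.*?)</{0}>".format(re.escape(tag)), re.DOTALL)
    return pattern.findall(dd_text)

dd = "<DB><Thesaurus>/usr/ot/default.the</Thesaurus><Text></Text></DB>"
print(field_values(dd, "Thesaurus"))
```

The function returns a list because some fields, such as Name, can occur in more than one section of the same Data Dictionary.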
Each section and field in the Data Dictionary is described separately in the following paragraphs. The Data Dictionary contains a Thesaurus field, a Text section, an Indices section, and a Regions section.
The Thesaurus field is enclosed by <Thesaurus> and </Thesaurus> tags. It contains the name of a file with thesaurus definitions. The format of this file is described in the `thesaurus' entry of the XPat Reference Manual and Tutorial. The filename can be specified using either a relative path or an absolute path.
The Text section is enclosed by <Text> and </Text> tags. It contains information relating to the text itself. Specifically, it contains an MfsFiles section, which describes all the individual files that make up the database.
The MfsFiles section is enclosed by <MfsFiles> and </MfsFiles> tags. The fields within the MfsFiles section are described in the mfs(5) man page. Refer to that man page for the details.
The Indices section is enclosed by <Indices> and </Indices> tags. It contains one or more Index sections.
The Index section is enclosed by <Index> and </Index> tags. It contains information about a single, named Main Index. Specifically, it contains a Name field, a FastFind section (if a Fast-Find index has been built on this Main Index), a File section, an InitFile field, an IndexPoints section, a Mappings section, and an IntegrityCheck field.
The Name field is enclosed by <Name> and </Name> tags. It names the index contained within the enclosing Index section. It is used when invoking xpat to specify which index is to be used in searching. The first Index section may have an empty Name field. All other Index sections must have non-empty Name fields.
The FastFind section is enclosed by <FastFind> and </FastFind> tags. It contains a FastFindCompression section, a FastFindIndex section and a FastFindWordList section. These sections describe information for each of the three files that constitute the FastFind index. Note that these sections are present in the Data Dictionary only if a Fast-Find index has been built on the database (this is always the case for MFS databases).
The FastFindCompression section is enclosed by <FastFindCompression> and </FastFindCompression> tags. It contains one File section.
The File section is enclosed by <File> and </File> tags. It specifies the FastFind Compression file. It contains a SysName field, a ModDate field, and an Offset field.
The SysName field is enclosed by <SysName> and </SysName> tags. It contains the file's filename or path.
The ModDate field is enclosed by <ModDate> and </ModDate> tags. It contains the last modification date of the file, encoded as a number. The database system maintains this number to ensure that the database hasn't been changed in an unauthorized manner.
The Offset field is enclosed by <Offset> and </Offset> tags. It contains the logical starting offset of the current information within the file. This field is usually set to 0, except in Region sections. Refer to the Region section, below, for details.
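The text above says only that ModDate is a date encoded as a number. The following sketch shows one way to inspect such a value, under the unconfirmed assumption that the number is a count of seconds since 1970-01-01 UTC. The function name moddate_to_utc is hypothetical, and the sample value is taken from the complete example later in this document.

```python
from datetime import datetime, timezone

def moddate_to_utc(moddate):
    """Interpret a ModDate value as seconds since 1970-01-01 UTC (an assumption)."""
    return datetime.fromtimestamp(int(moddate), tz=timezone.utc)

# Sample value from the example Data Dictionary.
print(moddate_to_utc("679335524").isoformat())
```

If the encoding differs from this assumption, the printed date will be meaningless. Treat the result as a debugging aid, not as a statement of how the database system computes the value.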
The FastFindIndex section is enclosed by <FastFindIndex> and </FastFindIndex> tags. It contains one File section that specifies the main Fast-Find Index file. The contents of the File section are described in the section on FastFindCompression, above. Refer to that section for details.
The FastFindWordList section is enclosed by <FastFindWordList> and </FastFindWordList> tags. It contains one File section that specifies the Fast-Find Word List file. The contents of the File section are described in the section on FastFindCompression, above. Refer to that section for details.
This section specifies the Main Index file. The contents of the File section are described in the section on FastFindCompression, above. Refer to that section for details.
The InitFile field is enclosed by <InitFile> and </InitFile> tags. It contains the name of a file which is read by xpat during initialization. Any legal xpat command may be contained in the initialization file. Typical uses are setting the DefaultRegion, defining macros, or defining a match set or region set commonly used in an xpat session. Refer to the XPat Reference Manual and Tutorial for more information on the valid xpat commands.
The IndexPoints section is enclosed by <IndexPoints> and </IndexPoints> tags. It contains one or more IndexPt sections.
Each IndexPt section is enclosed by <IndexPt> and </IndexPt> tags. These fields contain strings which indicate points in the text which should be indexed.
The simplest index point is simply two characters, such as <IndexPt>ab</IndexPt>. This example instructs xpatbld to create an index point each time an ``ab'' occurs in the text. For each such occurrence, an index point is generated for the ``b''.
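The rule for a simple two-character index point can be sketched as follows. This is an illustration only, not the xpatbld implementation: the function name index_points is hypothetical, and reporting zero-based character offsets is an assumption made for the example.

```python
def index_points(text, pair):
    """Return offsets of the second character of each occurrence of pair."""
    if len(pair) != 2:
        raise ValueError("a simple index point is exactly two characters")
    points = []
    start = text.find(pair)
    while start != -1:
        points.append(start + 1)            # the point is on the second character
        start = text.find(pair, start + 1)  # step by one so overlaps are found
    return points

print(index_points("abc cab abab", "ab"))
```

Each offset in the result identifies the ``b'' of one ``ab'' occurrence, which matches the description above.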
Since listing each two-letter combination to index can be cumbersome, each IndexPt section can contain meta-characters. A meta-character stands for a number of characters. For instance, the meta-character `&uppercase.' represents the characters ``ABCDEFG...'' and so on. An index point containing <IndexPt> &uppercase.</IndexPt> (note the space immediately preceding the `&' character) is equivalent to specifying the following:
<IndexPt> A</IndexPt>
<IndexPt> B</IndexPt>
<IndexPt> C</IndexPt>
and so on...
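The expansion of a meta-character into literal index points can be sketched in Python. The table META below is illustrative and holds only the one meta-character spelled out above; it assumes that `&uppercase.' covers exactly the 26 ASCII capital letters. The function name expand_index_point is hypothetical.

```python
import string

# Illustrative table only. The text gives "&uppercase." as "ABCDEFG...";
# this sketch assumes that means the 26 ASCII capital letters.
META = {"&uppercase.": string.ascii_uppercase}

def expand_index_point(spec):
    """Expand a two-position IndexPt spec into literal two-character points."""
    positions = []
    i = 0
    while i < len(spec):
        for name, chars in META.items():
            if spec.startswith(name, i):
                positions.append(chars)   # a meta-character stands for many characters
                i += len(name)
                break
        else:
            positions.append(spec[i])     # an ordinary literal character
            i += 1
    if len(positions) != 2:
        raise ValueError("an index point describes exactly two positions")
    first, second = positions
    return [a + b for a in first for b in second]

points = expand_index_point(" &uppercase.")
print(points[:3], len(points))
```

Because either position may be a meta-character, a spec with meta-characters in both positions expands to the full cross product of the two character sets.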
A meta-character may appear in place of either the first character, the second character, or both. The following meta-characters are defined:
!@#$%^&*()_+~|1234567890-=`\{}:"<>?[];',./ abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ
\241 \242 \243 \244 \245 \246 \247 \250 \251 \252 \253 \254 \255 \256 \257 \260 \261 \262 \263 \264 \265 \266 \267 \270 \271 \272 \273 \274 \275 \276 \277 \300 \301 \302 \303 \304 \305 \306 \307 \310 \311 \312 \313 \314 \315 \316 \317 \320 \321 \322 \323 \324 \325 \326 \327 \330 \331 \332 \333 \334 \335 \336 \337 \340 \341 \342 \343 \344 \345 \346 \347 \350 \351 \352 \353 \354 \355 \356 \357 \360 \361 \362 \363 \364 \365 \366 \367 \370 \371 \372 \373 \374 \375 \376 \377
\300 \301 \302 \303 \304 \305 \306 \307 \310 \311 \312 \313 \314 \315 \316 \317 \321 \322 \323 \324 \325 \326 \331 \332 \333 \334 \335 \340 \341 \342 \343 \344 \345 \346 \347 \350 \351 \352 \353 \354 \355 \356 \357 \361 \362 \363 \364 \365 \366 \371 \372 \373 \374 \375 \377
\300 \301 \302 \303 \304 \305 \306 \307 \310 \311 \312 \313 \314 \315 \316 \317 \321 \322 \323 \324 \325 \326 \331 \332 \333 \334 \335
\340 \341 \342 \343 \344 \345 \346 \347 \350 \351 \352 \353 \354 \355 \356 \357 \361 \362 \363 \364 \365 \366 \371 \372 \373 \374 \375 \377
\241 \242 \243 \244 \245 \246 \247 \250 \251 \252 \253 \254 \255 \256 \257 \260 \261 \262 \263 \264 \265 \266 \267 \270 \271 \272 \273 \274 \275 \276 \277 \320 \327 \330 \336 \337 \360 \367 \370 \376
The following meta-characters represent single characters which are special in the syntax of the Data Dictionary:
&amp.        &
&backspace.  \b
&lt.         <
&gt.         >
&return.     \r
&newline.    \n
&tab.        \t
The following meta-characters are defined for Unicode support. Note that the code points are specified as ranges using the Unicode 'U+' notation.
!@#$%^&*()_+~|1234567890-=`}:"<>?[];',./ abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ
!@#$%^&*()_+~|-=`\{}[]:";'<>?,./
The following meta-characters represent single characters which are special in the syntax of the Data Dictionary:
&amp.        &
&backspace.  \b
&lt.         <
&gt.         >
&return.     \r
&newline.    \n
&tab.        \t
The following scripts are based on UnicodeData.txt and Perl 5.8 unicore/lib files.
U+0041-U+005A U+0061-U+007A U+00AA-U+00AA U+00BA-U+00BA U+00C0-U+00D6 U+00D8-U+00F6 U+00F8-U+0220 U+0222-U+0233 U+0250-U+02AD U+02B0-U+02B8 U+02E0-U+02E4 U+1E00-U+1E9B U+1EA0-U+1EF9 U+2071-U+2071 U+207F-U+207F U+212A-U+212B U+FB00-U+FB06 U+FF21-U+FF3A U+FF41-U+FF5A
U+0531-U+0556 U+0559-U+0559 U+0561-U+0587 U+FB13-U+FB17
U+0981-U+0983 U+0985-U+098C U+098F-U+0990 U+0993-U+09A8 U+09AA-U+09B0 U+09B2-U+09B2 U+09B6-U+09B9 U+09BC-U+09BC U+09BE-U+09C4 U+09C7-U+09C8 U+09CB-U+09CD U+09D7-U+09D7 U+09DC-U+09DD U+09DF-U+09E3 U+09E6-U+09F1
U+3105-U+312C U+31A0-U+31B7
U+1740-U+1753
U+13A0-U+13F4
U+0400-U+0481 U+048A-U+04CE U+04D0-U+04F5 U+04F8-U+04F9 U+0500-U+050F U+0901-U+0903 U+0905-U+0939 U+093C-U+094D U+0950-U+0954 U+0958-U+0963 U+0966-U+096F
U+1200-U+1206 U+1208-U+1246 U+1248-U+1248 U+124A-U+124D U+1250-U+1256 U+1258-U+1258 U+125A-U+125D U+1260-U+1286 U+1288-U+1288 U+128A-U+128D U+1290-U+12AE U+12B0-U+12B0 U+12B2-U+12B5 U+12B8-U+12BE U+12C0-U+12C0 U+12C2-U+12C5 U+12C8-U+12CE U+12D0-U+12D6 U+12D8-U+12EE U+12F0-U+130E U+1310-U+1310 U+1312-U+1315 U+1318-U+131E U+1320-U+1346 U+1348-U+135A U+1369-U+137C
U+10A0-U+10C5 U+10D0-U+10F8
U+00B5-U+00B5 U+037A-U+037A U+0386-U+0386 U+0388-U+038A U+038C-U+038C U+038E-U+03A1 U+03A3-U+03CE U+03D0-U+03F5 U+1F00-U+1F15 U+1F18-U+1F1D U+1F20-U+1F45 U+1F48-U+1F4D U+1F50-U+1F57 U+1F59-U+1F59 U+1F5B-U+1F5B U+1F5D-U+1F5D U+1F5F-U+1F7D U+1F80-U+1FB4 U+1FB6-U+1FBC U+1FBE-U+1FBE U+1FC2-U+1FC4 U+1FC6-U+1FCC U+1FD0-U+1FD3 U+1FD6-U+1FDB U+1FE0-U+1FEC U+1FF2-U+1FF4 U+1FF6-U+1FFC U+2126-U+2126
U+0A81-U+0A83 U+0A85-U+0A8B U+0A8D-U+0A8D U+0A8F-U+0A91 U+0A93-U+0AA8 U+0AAA-U+0AB0 U+0AB2-U+0AB3 U+0AB5-U+0AB9 U+0ABC-U+0AC5 U+0AC7-U+0AC9 U+0ACB-U+0ACD U+0AD0-U+0AD0 U+0AE0-U+0AE0 U+0AE6-U+0AEF
U+0A02-U+0A02 U+0A05-U+0A0A U+0A0F-U+0A10 U+0A13-U+0A28 U+0A2A-U+0A30 U+0A32-U+0A33 U+0A35-U+0A36 U+0A38-U+0A39 U+0A3C-U+0A3C U+0A3E-U+0A42 U+0A47-U+0A48 U+0A4B-U+0A4D U+0A59-U+0A5C U+0A5E-U+0A5E U+0A66-U+0A74
U+1100-U+1159 U+115F-U+11A2 U+11A8-U+11F9 U+3131-U+318E U+AC00-U+D7A3 U+FFA0-U+FFBE U+FFC2-U+FFC7 U+FFCA-U+FFCF U+FFD2-U+FFD7 U+FFDA-U+FFDC
U+2E80-U+2E99 U+2E9B-U+2EF3 U+2F00-U+2FD5 U+3005-U+3005 U+3007-U+3007 U+3021-U+3029 U+3038-U+303B U+3400-U+4DB5 U+4E00-U+9FA5 U+F900-U+FA2D U+FA30-U+FA6A
U+1720-U+1734
U+05D0-U+05EA U+05F0-U+05F2 U+FB1D-U+FB1D U+FB1F-U+FB28 U+FB2A-U+FB36 U+FB38-U+FB3C U+FB3E-U+FB3E U+FB40-U+FB41 U+FB43-U+FB44 U+FB46-U+FB4F
U+3041-U+3096 U+309D-U+309F
U+0C82-U+0C83 U+0C85-U+0C8C U+0C8E-U+0C90 U+0C92-U+0CA8 U+0CAA-U+0CB3 U+0CB5-U+0CB9 U+0CBE-U+0CC4 U+0CC6-U+0CC8 U+0CCA-U+0CCD U+0CD5-U+0CD6 U+0CDE-U+0CDE U+0CE0-U+0CE1 U+0CE6-U+0CEF
U+30A1-U+30FA U+30FD-U+30FF U+31F0-U+31FF U+FF66-U+FF6F U+FF71-U+FF9D
U+1780-U+17D3 U+17E0-U+17E9
U+0E81-U+0E82 U+0E84-U+0E84 U+0E87-U+0E88 U+0E8A-U+0E8A U+0E8D-U+0E8D U+0E94-U+0E97 U+0E99-U+0E9F U+0EA1-U+0EA3 U+0EA5-U+0EA5 U+0EA7-U+0EA7 U+0EAA-U+0EAB U+0EAD-U+0EB9 U+0EBB-U+0EBD U+0EC0-U+0EC4 U+0EC6-U+0EC6 U+0EC8-U+0ECD U+0ED0-U+0ED9 U+0EDC-U+0EDD
U+0D02-U+0D03 U+0D05-U+0D0C U+0D0E-U+0D10 U+0D12-U+0D28 U+0D2A-U+0D39 U+0D3E-U+0D43 U+0D46-U+0D48 U+0D4A-U+0D4D U+0D57-U+0D57 U+0D60-U+0D61 U+0D66-U+0D6F
U+1810-U+1819 U+1820-U+1877 U+1880-U+18A9
U+1000-U+1021 U+1023-U+1027 U+1029-U+102A U+102C-U+1032 U+1036-U+1039 U+1040-U+1049 U+1050-U+1059
U+0B01-U+0B03 U+0B05-U+0B0C U+0B0F-U+0B10 U+0B13-U+0B28 U+0B2A-U+0B30 U+0B32-U+0B33 U+0B36-U+0B39 U+0B3C-U+0B43 U+0B47-U+0B48 U+0B4B-U+0B4D U+0B56-U+0B57 U+0B5C-U+0B5D U+0B5F-U+0B61 U+0B66-U+0B6F
U+16A0-U+16EA U+16EE-U+16F0
U+0D82-U+0D83 U+0D85-U+0D96 U+0D9A-U+0DB1 U+0DB3-U+0DBB U+0DBD-U+0DBD U+0DC0-U+0DC6 U+0DCA-U+0DCA U+0DCF-U+0DD4 U+0DD6-U+0DD6 U+0DD8-U+0DDF U+0DF2-U+0DF3
U+0710-U+072C U+0730-U+074A
U+0710-U+072C U+0730-U+074A
U+1760-U+176C U+176E-U+1770 U+1772-U+1773
U+0B82-U+0B83 U+0B85-U+0B8A U+0B8E-U+0B90 U+0B92-U+0B95 U+0B99-U+0B9A U+0B9C-U+0B9C U+0B9E-U+0B9F U+0BA3-U+0BA4 U+0BA8-U+0BAA U+0BAE-U+0BB5 U+0BB7-U+0BB9 U+0BBE-U+0BC2 U+0BC6-U+0BC8 U+0BCA-U+0BCD U+0BD7-U+0BD7 U+0BE7-U+0BF2
U+0C01-U+0C03 U+0C05-U+0C0C U+0C0E-U+0C10 U+0C12-U+0C28 U+0C2A-U+0C33 U+0C35-U+0C39 U+0C3E-U+0C44 U+0C46-U+0C48 U+0C4A-U+0C4D U+0C55-U+0C56 U+0C60-U+0C61 U+0C66-U+0C6F
U+0780-U+07B1
U+0E01-U+0E3A U+0E40-U+0E4E U+0E50-U+0E59
U+0F00-U+0F00 U+0F18-U+0F19 U+0F20-U+0F33 U+0F35-U+0F35 U+0F37-U+0F37 U+0F39-U+0F39 U+0F40-U+0F47 U+0F49-U+0F6A U+0F71-U+0F84 U+0F86-U+0F8B U+0F90-U+0F97 U+0F99-U+0FBC U+0FC6-U+0FC6
U+3400-U+4DB5 U+4E00-U+9FA5 U+FA0E-U+FA0F U+FA11-U+FA11 U+FA13-U+FA14 U+FA1F-U+FA1F U+FA21-U+FA21 U+FA23-U+FA24 U+FA27-U+FA29
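The range notation used in the lists above can be read mechanically. The following Python sketch parses one such list into inclusive code point ranges and tests whether a character falls inside it. The function names parse_ranges and in_ranges are hypothetical, and the sample string is copied from one of the lists above.

```python
def parse_ranges(spec):
    """Parse 'U+XXXX-U+YYYY ...' into a list of inclusive (low, high) pairs."""
    ranges = []
    for item in spec.split():
        low, high = item.split("-")
        ranges.append((int(low[2:], 16), int(high[2:], 16)))  # drop the 'U+'
    return ranges

def in_ranges(char, ranges):
    """True if the code point of char falls inside any of the ranges."""
    cp = ord(char)
    return any(low <= cp <= high for low, high in ranges)

ranges = parse_ranges("U+3041-U+3096 U+309D-U+309F")
print(in_ranges("\u3042", ranges), in_ranges("A", ranges))
```

A single code point appears in the lists as a range whose two ends are equal (for example U+00AA-U+00AA), so no special case is needed for it.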
The Mappings section is enclosed by <Mappings> and </Mappings> tags. It consists of two distinct parts. The first part is a list of Map sections, each of which maps a character, enclosed by <From> and </From> tags, to another character, enclosed by <To> and </To> tags. The most common use is to map uppercase letters into their lowercase equivalents, or punctuation into spaces.
It is also possible to map ranges of characters to their lower case equivalents (where this concept is applicable). The beginning character of the range enclosed in <First> and </First> is followed by the last character in the range enclosed in <Last> and </Last>. These two tag pairs are enclosed by <CharRange> and </CharRange>. The <CharRange> tag pair is enclosed by the <From> and <To> tag pairs as described for a single character above. For example:
<From>
  <CharRange>
    <First>A</First>
    <Last>Z</Last>
  </CharRange>
</From>
<To>
  <CharRange>
    <First>a</First>
    <Last>z</Last>
  </CharRange>
</To>
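The effect of a range mapping can be sketched as an expansion into single-character mappings. This is a sketch, not the xpat implementation: it assumes that the From and To ranges are paired position by position and must have the same length, which the text above does not state explicitly. The function name expand_char_range is hypothetical.

```python
def expand_char_range(from_first, from_last, to_first, to_last):
    """Expand a <CharRange> mapping into a dict of single-character mappings."""
    src = range(ord(from_first), ord(from_last) + 1)
    dst = range(ord(to_first), ord(to_last) + 1)
    if len(src) != len(dst):
        raise ValueError("From and To ranges must be the same length")
    return {chr(s): chr(d) for s, d in zip(src, dst)}

mapping = expand_char_range("A", "Z", "a", "z")
print(mapping["A"], mapping["Z"], len(mapping))
```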
Note: When xpat starts up, it first builds an initial map which maps all non-ASCII and all non-printable characters to NULL. xpat then reads the user-defined character mappings defined in the Mappings section and adds those specifications to the initial map. The user-defined mappings override the default mappings. One use of character mappings is to map selected non-printable characters to themselves. This effectively undoes the NULL mapping that xpat creates for those characters by default.
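The start-up behaviour described in this note can be sketched as a default map that is then overridden. The sketch assumes single-byte characters and assumes that printable ASCII characters initially map to themselves; the note states only which characters map to NULL. The name build_char_map is hypothetical.

```python
NULL = "\0"

def build_char_map(user_mappings):
    """Start from the default map, then let user-defined mappings override it."""
    char_map = {}
    for code in range(256):                 # assumes single-byte characters
        ch = chr(code)
        printable_ascii = 0x20 <= code <= 0x7E
        char_map[ch] = ch if printable_ascii else NULL
    char_map.update(user_mappings)          # user-defined entries win
    return char_map

# Map tab to itself (undoing the default NULL) and fold 'A' to 'a'.
cmap = build_char_map({"\t": "\t", "A": "a"})
print(repr(cmap["\t"]), repr(cmap["\n"]), cmap["A"], cmap["b"])
```

In this sketch the tab survives because it was mapped to itself, while the newline, which has no user-defined mapping, still maps to NULL.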
Two escape mechanisms exist to specify non-printable characters in the From and the To fields. The first mechanism is octal specification. Each octal specification consists of a backslash followed by three octal digits (e.g., `\003' for `^C'). The second mechanism is entity reference specification. The following table illustrates the entity references that can be used. The characters in the right-hand column can be specified using the corresponding entity reference in the left-hand column:
&amp.        &
&backspace.  \b
&lt.         <
&gt.         >
&return.     \r
&newline.    \n
&tab.        \t
Each of the From and To fields can contain at most one character, one octal code, or one entity reference. If a To field is empty, it means that the corresponding From character should be mapped to NULL.
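The rules for the contents of a From or To field can be sketched as a small decoder. The function name decode_field is hypothetical, and the entity names `&amp.', `&lt.' and `&gt.' for the characters &, < and > are an assumption of this sketch. Treating an empty field as NULL follows the rule given above for the To field.

```python
import re

# Entity names "&amp.", "&lt." and "&gt." are assumed for &, < and >.
ENTITIES = {
    "&amp.": "&", "&backspace.": "\b", "&lt.": "<", "&gt.": ">",
    "&return.": "\r", "&newline.": "\n", "&tab.": "\t",
}

def decode_field(value):
    """Decode one From/To field into a single character."""
    if value == "":
        return "\0"                            # empty To field maps to NULL
    if value in ENTITIES:
        return ENTITIES[value]
    if re.fullmatch(r"\\[0-7]{3}", value):
        return chr(int(value[1:], 8))          # backslash plus three octal digits
    if len(value) == 1:
        return value                           # a single literal character
    raise ValueError("field must hold one character, octal code, or entity")

print(repr(decode_field("\\003")), repr(decode_field("&tab.")), repr(decode_field("")))
```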
The second part of the Mappings section is a list of stopwords - words which are not indexed. The words themselves are enclosed by <Ignore> and </Ignore> tags. The whole list is enclosed by <StopWords> and </StopWords> tags. Note that stopwords are not supported by xpatbldu, the Unicode enabled version of xpatbld.
The IntegrityCheck field is enclosed by <IntegrityCheck> and </IntegrityCheck> tags. This field contains a single number that encodes relevant information about the indexing parameters to ensure that the descriptive information in the Data Dictionary matches the information used to actually create the index. It is maintained by the programs that build and maintain indices (e.g., xpatbld and xpatmaint). The IntegrityCheck value is also checked by xpat on startup. If an integrity error is detected, xpat will print an error message to that effect and will not search the database.
The Regions section is enclosed by <Regions> and </Regions> tags. It usually contains one or more Region sections. However, it may be empty or omitted if no regions are defined.
Each Region section is enclosed by <Region> and </Region> tags. It contains information defining a region of the database. Regions are used by xpat in the ``within'' and ``including'' commands (refer to the XPat Reference Manual and Tutorial for more information).
Each Region section has zero or more FastRegion sections, a Name field, a Desc field, a File section, a Count field, and a Type field.
Each FastRegion section is enclosed by <FastRegion> and </FastRegion> tags. Each FastRegion section contains information defining the FastRegion index between the enclosing region and a specific Main Index. Within a particular Region section, there can be at most one FastRegion section for each Index section in the Data Dictionary. The FastRegion sections are created by the xpatfr program when it builds the FastRegion indices.
Each FastRegion section contains a File section and an IndexName section.
The File section is enclosed by <File> and </File> tags. It specifies the actual file that contains the FastRegion index data for the enclosing FastRegion section. The contents of the File section are described in the FastFindCompression section above. Refer to that section for details.
The IndexName section is enclosed by <IndexName> and </IndexName> tags. It specifies the name of the Main Index in this Data Dictionary for which this particular FastRegion index was built. The index name in this field has to be the same as the Name in one of the Index sections in this Data Dictionary. This field can be empty if the FastRegion was built on the default index (which does not have a name).
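The two constraints on FastRegion sections (each IndexName must match the Name of an Index section, and a Region may hold at most one FastRegion per Index) can be checked with a sketch such as the following. The function name check_fastregions is hypothetical, and the names are assumed to have been extracted from the Data Dictionary already.

```python
def check_fastregions(index_names, fastregion_index_names):
    """Return (unknown, duplicates) for the FastRegion sections of one Region."""
    known = set(index_names)   # an empty string stands for the unnamed default index
    unknown = [n for n in fastregion_index_names if n not in known]
    seen, duplicates = set(), []
    for n in fastregion_index_names:
        if n in seen:
            duplicates.append(n)   # second FastRegion for the same index
        seen.add(n)
    return unknown, duplicates

print(check_fastregions(["", "word"], ["", "word", "word", "phrase"]))
```

Both returned lists are empty when the FastRegion sections of a Region are consistent with the Index sections.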
The Name field is enclosed by <Name> and </Name> tags. It contains the name by which that region is referenced in xpat.
The Desc field is enclosed by <Desc> and </Desc> tags. It contains a description of the region and may be empty or omitted.
The File section is enclosed by <File> and </File> tags. It indicates where to find the file containing the region's pointers into the text. The contents of the File section are described in the FastFindCompression section above. Refer to that section for details. Note that the Offset field within the File section may be non-zero. This is because the region building programs place the index data for several regions inside a single file. The Offset specifies where in that file the current region's segment begins.
The Count field is enclosed by <Count> and </Count> tags. It gives the number of pointers for this region. Note that this number is twice the number of regions defined because each region in a region set consists of a start pointer and an end pointer.
The Type field is enclosed by <Type> and </Type> tags. The only type that is currently supported is the `pairs' type (where each region is explicitly defined by a start and an end pointer).
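The relationship between Count and the number of regions in a `pairs' region set can be sketched as follows. The sketch assumes that pointers alternate start, end, start, end, which is one plausible reading of the description above; it does not describe the actual layout of a region file. The function name pair_pointers is hypothetical.

```python
def pair_pointers(pointers):
    """Group a flat pointer list into (start, end) tuples, one per region."""
    if len(pointers) % 2 != 0:
        raise ValueError("a 'pairs' region set needs an even number of pointers")
    return [(pointers[i], pointers[i + 1]) for i in range(0, len(pointers), 2)]

count = 672                  # a Count value as it might appear in a Region section
regions = count // 2         # each region uses two pointers
print(regions)

print(pair_pointers([10, 25, 40, 90]))
```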
This section is enclosed by <Grammar> and </Grammar> tags and is reserved for future XPAT use.
This section is enclosed by <Display> and </Display> tags and is reserved for future XPAT use.
The following is the Data Dictionary for a complete database. Note that parts of some sections have been removed to reduce the size of the example.
<DB>
<Thesaurus>/usr/ot/default.the</Thesaurus>
<Text>
  <MfsFiles>
    <FileMap>mydb</FileMap>
    <FilterChain>
      <SearchView>meta</SearchView>
      <DisplayView>meta</DisplayView>
      <RawView>meta</RawView>
      <DisplayFmt>ASCII</DisplayFmt>
      <DefaultDataTag></DefaultDataTag>
      <FileGroup>
        <MfsDir>data</MfsDir>
        <MfsFile>*.txt</MfsFile>
        <MfsExpand>file</MfsExpand>
      </FileGroup>
    </FilterChain>
  </MfsFiles>
</Text>
<Indices>
  <Index>
    <Name>default</Name>
    <File>
      <SysName>/usr/ot/manual/def.idx</SysName>
      <ModDate>679335524</ModDate>
      <Offset>0</Offset>
    </File>
    <InitFile>/usr/ot/manual/init</InitFile>
    <IndexPoints>
      <IndexPt> &alphanumeric.</IndexPt>
    </IndexPoints>
    <Mappings>
      <Map><From></From><To></To></Map>
      <Map><From>&backspace.</From><To> </To></Map>
      <Map><From>&tab.</From><To> </To></Map>
      <Map><From>&newline.</From><To> </To></Map>
      <Map><From>&return.</From><To> </To></Map>
      <Map><From>!</From><To> </To></Map>
      <Map><From>"</From><To> </To></Map>
      <Map><From>#</From><To> </To></Map>
      <Map><From>$</From><To> </To></Map>
      <Map><From>%</From><To> </To></Map>
      <Map><From>&amp.</From><To> </To></Map>
      ... Note: Some text deleted.
      <Map><From>A</From><To>a</To></Map>
      <Map><From>B</From><To>b</To></Map>
      <Map><From>C</From><To>c</To></Map>
      <Map><From>D</From><To>d</To></Map>
      <Map><From>E</From><To>e</To></Map>
      ... Note: Some text deleted.
      <Map><From>~</From><To> </To></Map>
      <StopWords>
        <Ignore>a</Ignore>
        <Ignore>an</Ignore>
        <Ignore>and</Ignore>
        <Ignore>are</Ignore>
        ... Note: Some text deleted.
        <Ignore>with</Ignore>
      </StopWords>
    </Mappings>
    <LongestMatch>
      <Length>0</Length>
      <Resolution>0</Resolution>
    </LongestMatch>
    <IntegrityCheck>1846024038</IntegrityCheck>
  </Index>
  <Index>
    <Name>word</Name>
    <File>
      <SysName>/usr/ot/manual/word.idx</SysName>
      <ModDate>679335592</ModDate>
      <Offset>0</Offset>
    </File>
    <IndexPoints>
      <IndexPt> &printable.</IndexPt>
      <IndexPt>-&alphanumeric.</IndexPt>
      <IndexPt>&alphanumeric.-</IndexPt>
      <IndexPt>&printable.&lt.</IndexPt>
    </IndexPoints>
    <Mappings>
      <Map><From></From><To></To></Map>
      <Map><From>&backspace.</From><To> </To></Map>
      <Map><From>&tab.</From><To> </To></Map>
      <Map><From>&newline.</From><To> </To></Map>
      <Map><From>&return.</From><To> </To></Map>
      <Map><From>!</From><To> </To></Map>
      ... Note: Some text deleted.
      <Map><From>~</From><To> </To></Map>
      <StopWords></StopWords>
    </Mappings>
    <LongestMatch>
      <Length>0</Length>
      <Resolution>0</Resolution>
    </LongestMatch>
    <IntegrityCheck>736122026</IntegrityCheck>
  </Index>
</Indices>
<Regions>
  <Region>
    <Name>cmd</Name>
    <Desc>Illustrations of xpat commands.</Desc>
    <File>
      <SysName>/usr/ot/manual/rgn.cmd</SysName>
      <ModDate>679335629</ModDate>
      <Offset>0</Offset>
    </File>
    <Count>672</Count>
    <Type>pairs</Type>
  </Region>
  <Region>
    ... Note: Some text deleted.
  </Region>
  <Region>
    ... Note: Some text deleted.
  </Region>
</Regions>
</DB>
The following paragraphs describe the contents of the Files section used in Release 4.x Data Dictionaries.
The Files section is enclosed by <Files> and </Files> tags. It contains one File section that describes the text file (in Release 4.x databases, the text was in ASCII or tagged ASCII format, and was in a single file). The contents of the File section are described in the FastFindCompression section above. Refer to that section for details.
The following sections of the Data Dictionary reference specific files:
<Thesaurus>                   thesaurus file
<Text><Files>                 database's text
<Indices><Index><File>        index over the database
<Indices><Index><InitFile>    initialization commands for xpat
<Regions><Region><File>       region files
xpat(1), xpatbld(1), xpatmaint(1), xpatrgn(1), multirgn(1), sgmlrgn(1), xpat_export(5), regions(5)