XPATRGN

Section: User Commands (1)
Updated: November 2000
 

NAME

xpatrgn - XPAT region file builder  

SYNOPSIS

xpatrgn [ -v ] [ -d description_text ] [ -r region_name ] [ -p patterns_file ] -o region_file -D data_dictionary

DESCRIPTION

xpatrgn builds the region_file for the database specified by data_dictionary, using the patterns specified in the patterns_file. xpatrgn also updates the `Region' section for region_name in the data_dictionary. If region_name is not specified on the command line, xpatrgn uses the prefix of the region_file as the region name. If the -p option is not specified, xpatrgn expects the region patterns on its standard input (e.g., from previous programs in a pipeline). After the region is built, it is referred to in xpat as region_name. Refer to the regions(5) man page for more information on the format of the region_file that xpatrgn produces.

The region patterns in the patterns_file consist of pairs of starting and ending strings, one pair per line. xpatrgn will search for occurrences of these string pairs in the text and record their offsets in region_file. Once a starting string has been found, xpatrgn will search for the first occurrence of the corresponding ending string in order to end the region. Nested occurrences are ignored. Regions begin on the first character of the starting string and end on the last character of the ending string. These positions may be modified by adding or subtracting an integer value, as shown in the example below. If the ending string of any pair is not given in the input, xpatrgn will begin regions on occurrences of the starting string, and will end the regions on the character before the first character of the next region. If the end of the text is reached in the middle of a region, the program will record the location of the last character in the text as the end position of the last region.
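The matching rules above can be sketched in Python. This is only an illustration for a single pattern pair, not xpatrgn's actual implementation: the function name `build_regions` and its parameters are hypothetical, and offsets are assumed to be 0-based, inclusive character positions.

```python
def build_regions(text, start, end=None, start_adj=0, end_adj=0):
    """Return (first, last) inclusive offsets for each region (illustrative only)."""
    regions = []
    last_char = len(text) - 1
    pos = text.find(start)
    while pos != -1:
        if end is None:
            # No ending string: the region runs to the character before
            # the next region, or to the last character of the text.
            nxt = text.find(start, pos + len(start))
            stop = last_char if nxt == -1 else nxt + start_adj - 1
            resume = nxt
        else:
            # Take the first ending string after the starting string;
            # starting strings nested inside the region are ignored.
            e = text.find(end, pos + len(start))
            if e == -1:
                # End of text reached in the middle of a region.
                stop = last_char
                resume = -1
            else:
                stop = e + len(end) - 1 + end_adj
                resume = text.find(start, e + len(end))
        regions.append((pos + start_adj, stop))
        pos = resume
    return regions
```

Under these assumptions, `build_regions(text, "<Headline>", "</Headline>", 10, -11)` would return the offsets of the text between the tags.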

Note: this algorithm differs from the one xpat uses to make regions during a search session. Consider the text,

    ( a b ( c d ) ( d e f

and the region pattern,

     "(" ")"

(i.e., build regions between the `(' and `)' characters). xpatrgn would build the regions `( a b ( c d )' and `( d e f'. xpat, on the other hand, would find all the matches that could start a region and all the matches that could end a region, and would then take the nearest pairs. For the above text, xpat would record the single region `( c d )'. It would not record a region for either `( a b' or `( d e f'.
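The nearest-pairs behaviour attributed to xpat can be illustrated with a short Python sketch. It encodes one reading of "nearest pairs" (each ending match is paired with the closest starting match that follows the previous ending match); that reading, and the name `nearest_pairs`, are assumptions made for illustration, not xpat's documented algorithm.

```python
def nearest_pairs(text, start, end):
    """Pair each ending match with the closest starting match that lies
    after the previous ending match (illustrative assumption)."""
    starts = [i for i in range(len(text)) if text.startswith(start, i)]
    ends = [i for i in range(len(text)) if text.startswith(end, i)]
    regions = []
    prev_end = -1
    for e in ends:
        # Candidate starts lie between the previous ending match and this one.
        candidates = [s for s in starts if prev_end < s < e]
        if candidates:
            s = max(candidates)  # the start nearest to this end
            regions.append((s, e + len(end) - 1))
        prev_end = e
    return regions


text = "( a b ( c d ) ( d e f"
pairs = nearest_pairs(text, "(", ")")
matched = [text[first:last + 1] for first, last in pairs]
```

For the sample text this yields only the region `( c d )`, whereas the first-start, first-end rule used by xpatrgn yields two regions.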

The special character sequences `\^' and `\$' will match the first and last characters in the text, respectively.  

OPTIONS

-v
Specify verbose mode. This option tells xpatrgn to print progress messages to the standard output, as it builds the index. By default, xpatrgn works silently.
-d description_text
Specify the region description. Each `Region' section in the Data Dictionary contains a `Desc' field. This field contains a description of the region (which is used in Help screens in user interfaces, among other things). xpatrgn will place the description_text in the `Desc' field for the region it is building. Note that if this text consists of more than one word (the normal case), it should be surrounded by quotes.
-r region_name
Specify the region name. By default, xpatrgn uses the prefix of the region_file as the region name. This option is useful if the region name and the region filename are different. Note that if region_name contains spaces, you should surround it with quotes.
-p patterns_file
Specify the region patterns file. By default, xpatrgn expects the patterns on the standard input.
-o region_file
Specify the region file to build. This option is required.
-D data_dictionary
Specify the Data Dictionary of the database. This option is required.
 

EXAMPLES

The input pattern,

      "\n"

creates regions that are located between newline characters. Note that these regions will start at each newline character and there will be no region created for the first line (the text before the first newline).

The input pattern,

      "\^"
      "\n" + 1

creates a region for each line in the file, starting on the first character in each line. This pattern will also include the first line in the file.

The input pattern,

     "<Headline>" +10 "</Headline>" -11

creates regions between `<Headline>' and `</Headline>' tags, except that the actual regions begin on the first letter after the `<Headline>' tag, and end on the last letter before the `</Headline>' tag. This is different from the actions of multirgn, which includes the tags.
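The offsets follow from the tag lengths: `<Headline>` is 10 characters long and `</Headline>` is 11. A small Python check of the arithmetic, using a made-up sample string and 0-based offsets (both are assumptions for illustration):

```python
text = "<Headline>Storm hits coast</Headline>"  # hypothetical sample text
start_tag, end_tag = "<Headline>", "</Headline>"

s = text.find(start_tag)                     # first character of the starting string
e = text.find(end_tag, s + len(start_tag))   # first character of the ending string

first = s + 10                               # "+10": skip the 10-character start tag
last = e + len(end_tag) - 1 - 11             # "-11": back up over the 11-character end tag

headline = text[first:last + 1]              # region contents, tags excluded
```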

The command,

     xpatrgn -p my_patrns.ptn -o Patrn1.rgn -D text.dd

builds a region for the database specified by the Data Dictionary, `text.dd'. It uses the patterns specified in `my_patrns.ptn' and puts the index in the file `Patrn1.rgn'. It names the region `Patrn1'.

The command,

     xpatrgn -v -d "This is my pattern" -r "My Pattern" -p my_patrns.ptn \
         -o MyPat.rgn -D data.dd

builds a region for the database specified by the Data Dictionary, `data.dd'. xpatrgn will print progress messages as it builds the index. It will record the description, `This is my pattern' in the Data Dictionary entry for the region it builds. It will name the region, `My Pattern'. It will get the patterns from the file, `my_patrns.ptn'. Finally, it will place the index in the file, `MyPat.rgn'.

The command,

     ptrn_prog | xpatrgn -o Patrn1.rgn -D text.dd

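builds a region called `Patrn1' for the database specified by `text.dd'. Because the -p option is not given, xpatrgn reads the region patterns from its standard input, here the output of `ptrn_prog'. It will put the index in the file `Patrn1.rgn'.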
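When xpatrgn is driven from a script, the same invocations can be assembled programmatically. The following Python sketch uses only the options listed in the SYNOPSIS; the helper names `xpatrgn_argv` and `run_with_patterns` are hypothetical, and the sketch assumes xpatrgn is on the PATH.

```python
import subprocess


def xpatrgn_argv(region_file, data_dictionary, patterns_file=None,
                 region_name=None, description=None, verbose=False):
    """Build the argument list for an xpatrgn call (illustrative helper)."""
    argv = ["xpatrgn"]
    if verbose:
        argv.append("-v")
    if description is not None:
        argv += ["-d", description]
    if region_name is not None:
        argv += ["-r", region_name]
    if patterns_file is not None:
        argv += ["-p", patterns_file]
    # -o and -D are the two options the SYNOPSIS shows without brackets.
    argv += ["-o", region_file, "-D", data_dictionary]
    return argv


def run_with_patterns(patterns, region_file, data_dictionary):
    """Feed the patterns on standard input, as in the pipeline example."""
    return subprocess.run(xpatrgn_argv(region_file, data_dictionary),
                          input=patterns, text=True, check=True)
```

Passing the arguments as a list avoids shell quoting problems with values such as a description that contains spaces.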

SEE ALSO

xpat(1), multirgn(1), regions(5), data_dict(5)


 

