Browsing - Overview and Contents
Through a combination of database tables and collmgr field configuration,
dynamic browsing is available through the DLXS middleware. When the proper
CGI-based URL is received, the middleware checks for certain metadata in the
database tables, packages some of it in XML, uses some of it as is, in its
stored XML format, and lets XSLT format the results. At run time, no XPAT
queries are needed; only MySQL queries are run against the item browse tables
in the dlxs database.
After the tables are prepared, that is, populated with item metadata through
the $DLXSROOT/bin/browse/updatebrowsedb.pl script (aka "database populator"),
configuration for the behavior of dynamic browsing is accomplished by modifying
certain collmgr fields.
Provisions will soon be made for creating static HTML browse pages with the
database populator script.
Contents:
Item browse database tables
There are three tables in the dlxs database that are specifically used for
dynamic browsing in the middleware:
- ItemColl
- ItemBrowse
- ItemBrowseCounts
The ItemColl table holds, for each item/document and collection combination,
one row containing the following columns:
- the idno
- the collection id
- the modification date of the row's information
- XML metadata about the item. In the case of TextClass, this is simply the
DLXSTEXTCLASS/HEADER element that is retrieved from an XPAT query in the
"database populator" script. In the case of ImageClass, the Perl subclass used
by the database populator grabs information from the MySQL or XPAT data
and wraps it in specific XML before filling in this field.
The ItemBrowse table holds, for each item/document's browseable field (e.g.,
author, title, etc.):
- the idno
- the collection id
- the field name
- the value of the field
- rank (not currently used)
The ItemBrowseCounts table holds, for each collection, a list of rows containing:
- the collection id
- the field name
- the first character or the first two characters of the sortable
field's value (sortable title, author's name)
- the count of items that begin with that first character or those first
two characters
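The relationship between browseable field values and the rows of ItemBrowseCounts can be illustrated with a short Python sketch. This is not the code used by updatebrowsedb.pl: the function name, the sample titles, and the decision to uppercase the prefixes are assumptions made purely for illustration.

```python
from collections import Counter


def prefix_counts(sort_values):
    """Count values by first character and by first two characters.

    Hypothetical helper: the real database populator may normalize
    values (case, punctuation, diacritics) differently.
    """
    one_char = Counter()
    two_char = Counter()
    for value in sort_values:
        value = value.strip()
        if not value:
            continue  # nothing to count for an empty value
        key = value.upper()  # assumption: prefixes are case-insensitive
        one_char[key[:1]] += 1
        two_char[key[:2]] += 1
    return one_char, two_char


# Made-up sortable titles for one collection and one field.
titles = [
    "Bulletin of the society",
    "Burial customs",
    "Botany primer",
    "Maps of Michigan",
]
one_char, two_char = prefix_counts(titles)
```

Each (prefix, count) pair in the two counters corresponds conceptually to one ItemBrowseCounts row for that collection and field.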
Collmgr fields for configuration
The main fields in the collmgr that need to be set properly are:
- locale
- browsenav
- browsefields
- browseupdatemodule
See "Configure the collmgr fields" below for more information.
Preparing collections for browsing
Configure the collmgr fields
Start collmgr and change the following fields:
- devhost: if you are running the database populator in a development environment
(that is, where the DLPS_DEV environment variable
is set), you can have the middleware use XPAT-indexed data that is
on a machine different from the usual host. This can be useful for testing
purposes.
- locale: this should be changed to a UTF-8 locale, e.g., en_US.UTF-8.
- browseable: a "yes" (case-insensitive) value in this field enables the browse tab in the user interface. If the collection-specific web directory for this collection contains a file named
browse.html, that page will be served, but only if browsefields is empty. Fallback is applied to select the correct browse.html file for collection-specific customization. This supports static browse pages. If browse.html is not present, a dynamic browse page will be served based on data from the browse database. When a dynamic browse page is served, the browsenav field value is consulted and must be defined.
- browsenav: enter 0, 1, or 2. If you want no paging, that is, that all
items in the collection appear on one HTML page for the user to browse,
enter 0. Enter 1, if you want "one level of browsing", that is, that a
separate page be created for each first character of the value in question
(e.g., title or author) and that a navigation bar be built that allows
the user to navigate to each page, for example, jump to the page listing
items whose value begins with "M". Entering 2 in this field will create
a "two-level browse", where two navigation bars will be created. The first
bar will allow the user to jump to items whose values begin with a particular
first character (e.g., jump to the records that begin with "B"). The second
navigation bar will allow the user to jump to items whose values begin
with a particular two-character combination, (e.g., records that begin
with "Bu"). This decision is left to the collection coordinator. In our
experience, the appropriate level depends on the total number of items in
the collection and therefore on what is a reasonable number of browseable
items for a single HTML page.
- browsefields: list the browseable fields for the collection. For example,
some collections may have only title browsing, others may need both title
and author, etc. Leave this field empty to enable static browsing.
- browseupdatemodule: specifies the name of the browse update Perl module
that will be used by the updatebrowsedb.pl script to populate the database.
This value is analogous to the appmodule and subclassmodule fields. (This
field exists as of Release 12a; it supersedes a Perl configuration hash used
in Release 12.) The module files are located in $DLXSROOT/bin/browse. If a
dynamic browse page is to be served, this field must have a value. Specialized
behavior can be obtained by subclassing the browse update modules.
The currently available browse update module values are as follows:
- ImageClass: BrowseUpdate/ImageMysqlBU
- FindaidClass: BrowseUpdate/FindaidBU
- TextClass, encodingtype = monograph: BrowseUpdate/MonographBU
- TextClass, encodingtype = serialissue: BrowseUpdate/SerialIssueBU (note that newspapers are serialissue encodingtype)
- TextClass, encodingtype = serialarticle: BrowseUpdate/SerialArticleBU
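The interaction between browseable, browsefields, browse.html, browsenav, and browseupdatemodule described above can be summarized as a small decision function. The Python below is only one reading of that description, not the middleware's actual logic. The function name and the has_browse_html flag are hypothetical, and the case where browse.html is absent and browsefields is also empty is treated as dynamic because the text says a dynamic page is served when browse.html is not present.

```python
def choose_browse_mode(browseable, browsefields, has_browse_html,
                       browsenav=None, browseupdatemodule=None):
    """Return "none", "static", or "dynamic" for a collection.

    Parameter names mirror collmgr fields, except has_browse_html,
    a hypothetical flag meaning a browse.html file was found for the
    collection (after fallback).
    """
    if browseable.strip().lower() != "yes":
        return "none"  # browse tab is not enabled
    if has_browse_html and not browsefields.strip():
        return "static"  # browse.html is served only if browsefields is empty
    # Dynamic browsing: both browsenav and browseupdatemodule must be set.
    if str(browsenav) not in ("0", "1", "2"):
        raise ValueError("browsenav must be 0, 1, or 2 for dynamic browsing")
    if not browseupdatemodule:
        raise ValueError("browseupdatemodule must be set for dynamic browsing")
    return "dynamic"
```

For example, choose_browse_mode("Yes", "", True) yields "static", while a collection with browsefields set to "title", browsenav set to 1, and browseupdatemodule set to BrowseUpdate/MonographBU yields "dynamic".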
Populating the item browse tables
To populate the item browse tables for the first time, or to update them,
use the script updatebrowsedb.pl, which is located in $DLXSROOT/bin/browse.
Running this program will populate or update the necessary rows in each
of the three item browse tables. These tables will be queried when
the user requests browsing from the middleware.
However, please note: unless you are making changes to or
need to debug updatebrowsedb.pl, you should use the "wrapper" shell
script provided in the same subdirectory. This wrapper is called ub and
was written to ensure that updatebrowsedb.pl
- is run from the "release" directory and not from a particular developer's
directory (relevant in a development environment that uses multiple
developers' directories and environments)
- runs with certain environment variables properly set
- assumes the use of the "production" row, if no row is specified, when
setting the host from which data will be read
$DLXSROOT/bin/browse/ub -C class -c collection [ -r row [ -h host ] ] [ -f ] [ -p ]
- -f : is optional but if supplied, the wrapper will run the updatebrowsedb.pl
script without asking for confirmation
- -p : is used to "purge" all records from the browse tables for a
particular collection without re-populating (updating) the collection's
browse information
- -r : the row is optional. Without it, the "production" row will
be assumed and used.
- -h : if row is supplied, host is optional. The script will force updatebrowsedb.pl
to use the host given. If no row is supplied, host cannot be supplied.
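For sites that call the wrapper from another script, the option rules above (in particular, that a host may only accompany a row) can be checked before ub is invoked. The following Python sketch only assembles the argument list shown in the usage line. The function name build_ub_command and its parameters are hypothetical, and the acceptable values for -C are assumed to match the class values that updatebrowsedb.pl accepts.

```python
def build_ub_command(dlxsroot, dlxs_class, collection,
                     row=None, host=None, force=False, purge=False):
    """Return the argument list for the ub wrapper (hypothetical helper).

    dlxsroot is the value of $DLXSROOT; nothing is executed here.
    """
    if host is not None and row is None:
        # The wrapper accepts -h only together with -r.
        raise ValueError("-h host may only be given together with -r row")
    cmd = [f"{dlxsroot}/bin/browse/ub", "-C", dlxs_class, "-c", collection]
    if row is not None:
        cmd += ["-r", row]
        if host is not None:
            cmd += ["-h", host]
    if force:
        cmd.append("-f")  # skip the confirmation prompt
    if purge:
        cmd.append("-p")  # purge the collection's rows without re-populating
    return cmd
```

The resulting list could be passed to subprocess.run; building a list rather than a shell string avoids quoting problems with collection ids or host names.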
If, for any reason, you must override the assumptions made by the "wrapper"
script, you can always run updatebrowsedb.pl directly by entering:
$DLXSROOT/bin/browse/updatebrowsedb.pl class=AAA c=BBB host=CCC row=DDD
where AAA is either "text" or "image"; BBB is
the collection id of the collection you want to create browsing for; CCC is
the name of the host on which the XPAT index for the collection resides
(this is not relevant to ImageClass, which uses MySQL for all queries);
and DDD is the key to the row in the database you wish to use (production,
dlxsadm, or an individual developer's id). For example, you may want to point the
script at new or test data on a machine that is different from your production
machine. You could accomplish this by changing the host or devhost field
for the collection in the collmgr. NOTE: If DLPS_DEV is
set when you invoke updatebrowsedb.pl (without
the wrapper ub), the devhost field
will be used; otherwise, the
host field will be used.
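The devhost/host rule in the note above can be expressed in a few lines. The Python below is a sketch of that rule only. The function name select_host and the dictionary standing in for a collmgr row are hypothetical, the sketch assumes any non-empty DLPS_DEV value counts as "set", and it does not model the host= argument given on the command line.

```python
import os


def select_host(collmgr_row, environ=None):
    """Return (field_name, value) for the host field that would be used.

    collmgr_row is a hypothetical dict with "host" and "devhost" keys.
    environ defaults to the process environment.
    """
    env = os.environ if environ is None else environ
    # Assumption: DLPS_DEV counts as set when it has a non-empty value.
    field = "devhost" if env.get("DLPS_DEV") else "host"
    return field, collmgr_row[field]
```

With a row such as {"host": "prod.example.org", "devhost": "dev.example.org"} (made-up host names), the function returns the devhost entry when DLPS_DEV is present in the environment and the host entry otherwise.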