This document describes where and how to load media (images and other media, including audio and video) onto a server for web access via Image Class. Image Class supports a wide variety of media formats, including specialized support for JPEG2000 and MrSID images. Converting files to formats suitable for use in Image Class is not covered.
The basic steps here include loading media files to a web server and into a prescribed (though flexible) directory structure and then executing a program called "imageprep" which builds an index directory containing references to the image/media files. The index directory is used by the Image Class CGI middleware, and by the data record transformation scripts, as a consistent means of locating the image files of a collection.
Image Class is configured by default to handle specific directory structures and filenaming conventions. The directory structures and conventions are configurable to support a wide variety of situations. Image Class is not limited to images. Many media formats can be delivered.
Image Loading takes place prior to the loading of data records, except in cases where a database does not have digital image files.
Image Class supports by default the following formats, but is configurable to handle others:
Please refer to the Image Class Image File Naming document for the details.
By default, Image Class allows the following filename extensions, but is configurable to handle others:
It is easiest if filenames are unique within a collection. If they are not, the subdirectory path can be used to force uniqueness. To enable this function, edit the file at $DLXSROOT/bin/i/image/localimageprep.cfg, adding the following code to the "COLL SPECIFIC OVERRIDES/ADDITIONS" section:
if ($coll eq 'collid') { $gLoadedName = 'loaded'; }
Be sure to replace "collid" with the collection's ID!
Image files must be stored using the following collection level directory convention:
For example, the unique collid for the collection "French Architecture" is "sampleic", in which case the image file directory is...
$DLXSDATAROOT/img/s/sampleic
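The directory convention can be illustrated with a minimal Python sketch. It assumes, based only on the examples in this document (img/s/sampleic, img/m/musart, img/c/collid), that the intermediate directory is the first letter of the collid. The function name and the "/l1" root are hypothetical and are not part of Image Class.

```python
import posixpath


def collection_image_dir(root: str, collid: str) -> str:
    """Build <root>/img/<first letter of collid>/<collid>.

    Assumption: the single-letter directory is the first character of the
    collection ID, as the examples in this document suggest.
    """
    if not collid:
        raise ValueError("collid must not be empty")
    return posixpath.join(root, "img", collid[0], collid)


# "/l1" is a made-up root used only for illustration.
print(collection_image_dir("/l1", "sampleic"))  # /l1/img/s/sampleic
```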
The software assumes that every image has a thumbnail image and a larger display image.
Within the collection directory, images should be stored in the following locations.
To clarify with an example, the Image Class middleware assumes that all JPEG files in directories named "thumbjp2" within the $DLXSROOT/img/c/collid structure are thumbnails to be displayed in association with JPEG2000 images (stored in "jp2" directories). All JPEG images not in thumb directories are assumed to be single-resolution images for large display.
Example:
The "sampleic" collection can again be used as a simple example. $DLXSROOT/img/s/sampleic has the following directories...
drwxrwxr-x 4 jweise dlps 512 Feb 15 14:37 index
drwxrwxr-x 2 jweise dlps 2048 Feb 15 13:30 sid
drwxrwxr-x 2 jweise dlps 2048 Jun 8 1998 thumb
$DLXSROOT/img/s/sampleic/sid contains all of the collection's multiple-resolution SID files.
$DLXSROOT/img/s/sampleic/thumb contains all of the collection's thumbnail-size JPEG files.
The sampleic collection does not have any large JPEG files since it relies on SID files for large display.
The flexibility of structure within the collection specific image directory is intentional and supports the variety of directory structures that are typically encountered.
For example, at Michigan, we find it useful to load the image files on the production server in a structure that reflects the CD on which the master image files are stored. A single collection can easily have dozens of CDs' worth of master image data (typically TIFF format files). In the process of generating SID and JPEG files, we keep at least the name of the CD in the name of the directory that holds the SID and JPEG images.
...
CD0005
Image directories (that are loaded to the production server in the $DLXSROOT/img/c/collid directory)
thumb
...
CD0005
thumb
It is fine for the images to be loaded in this type of hierarchical structure. When the "imageprep" program (discussed below) creates the index directory, it recursively parses directories to find all image files.
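The recursive search can be pictured with the following Python sketch. It is not the imageprep implementation. The extension set is illustrative only, and skipping directories named "index" and "indexprep" is an assumption made here so that generated index content is not picked up as source images.

```python
import os

# Illustrative extension list; the real defaults live in imageprep.cfg.
IMAGE_EXTENSIONS = {".jpg", ".gif", ".sid", ".jp2"}


def find_image_files(collection_dir, extensions=IMAGE_EXTENSIONS,
                     skip_dirs=("index", "indexprep")):
    """Return image file paths found anywhere below collection_dir."""
    found = []
    for dirpath, dirnames, filenames in os.walk(collection_dir):
        # Prune skipped directories in place and visit the rest in a stable order.
        dirnames[:] = sorted(d for d in dirnames if d not in skip_dirs)
        for name in sorted(filenames):
            if os.path.splitext(name)[1].lower() in extensions:
                found.append(os.path.join(dirpath, name))
    return found
```

Because the walk descends through every subdirectory, a layout such as sid/CD0005 and thumb/CD0005 is handled the same way as a flat sid and thumb layout.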
Image files need to be readable by the web server, which often runs as user "nobody".
Image directories should be 775 and image files should be 664. The "imageprep" program will attempt to set the proper permissions on all image directories and files. However, if the user executing "imageprep" does not have the necessary permissions to change the mode of the directories and files, the attempt will fail, and "imageprep" will generate a message reporting that this is the case.
Alternatively, the chmod command can be used to set permissions. There are ways to modify files in batch with UNIX commands, but this topic is beyond the scope of this document.
chmod 775 $DLXSROOT/img/m/musart/sid
chmod 664 abc.sid
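For readers who prefer a script to individual chmod commands, here is one possible Python sketch that applies the same modes (775 for directories, 664 for files) to everything below a directory. The function name is hypothetical and this is not part of Image Class. Symlinks are skipped, and failures are collected rather than raised, loosely mirroring the way imageprep reports a problem instead of stopping.

```python
import os

DIR_MODE = 0o775   # rwxrwxr-x
FILE_MODE = 0o664  # rw-rw-r--


def set_image_permissions(top):
    """Apply DIR_MODE to directories and FILE_MODE to files below top.

    Returns a list of (path, error message) pairs for paths that could not
    be changed, for example because the caller does not own them.
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(top):
        targets = [(dirpath, DIR_MODE)]
        targets += [(os.path.join(dirpath, name), FILE_MODE) for name in filenames]
        for path, mode in targets:
            if os.path.islink(path):
                continue  # leave symlinks alone
            try:
                os.chmod(path, mode)
            except OSError as exc:
                failures.append((path, str(exc)))
    return failures
```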
The Image Class middleware (the CGI) expects to find in the index directory references to the actual image files. The index directory is generated automatically by the imageprep program. For each set of similarly named image files (e.g., JPEG thumbnail and large SID) a .inf file is created. For each thumbnail a symlink is also created.
Advantages of the index directory are:
A program called "imageprep" is provided that recursively locates image files in a collection's image directory and builds .inf files and symlinks. It also capitalizes all filename extensions it creates within the index directory.
It is typically invoked on the command line as follows...
$DLXSROOT/bin/i/image/imageprep collid
For example:
$DLXSROOT/bin/i/image/imageprep sampleic
Each time the "imageprep" program is executed for a collection, a completely new index directory is generated. To minimize downtime, the new index directory is created in a directory called "indexprep". At the end of execution, imageprep moves the current index directory out of the way and the new index directory into place. The old index directory is then automatically deleted.
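The build-then-swap approach described above can be sketched in Python as follows. This is an illustration of the general technique, not the imageprep code. The temporary name "index.old" and the function name are assumptions made for this example, since this document does not say where the current index is moved.

```python
import os
import shutil


def swap_in_new_index(collection_dir):
    """Replace <collection_dir>/index with the freshly built indexprep."""
    new_index = os.path.join(collection_dir, "indexprep")
    current = os.path.join(collection_dir, "index")
    old = os.path.join(collection_dir, "index.old")  # hypothetical name

    if not os.path.isdir(new_index):
        raise FileNotFoundError(new_index)
    if os.path.isdir(old):
        shutil.rmtree(old)          # clear leftovers from an earlier run
    if os.path.isdir(current):
        os.rename(current, old)     # move the live index out of the way
    os.rename(new_index, current)   # put the new index into place
    shutil.rmtree(old, ignore_errors=True)  # finally delete the old index
```

The two renames are quick compared with rebuilding the index, so the collection is without an index directory only for a brief moment.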
If an image filename is not unique, only the newest instance that is encountered will be included in the index directory.
table
The table argument is optional. If "table" is included in the command, filenames are put into a MySQL table named collid_filenames. If such a table already exists, it is dropped first. The imageprep program does not use the filenames table in any way, and at this time the table is not required for other data prep steps. It may simply be useful in certain situations.
nosymlinks
The nosymlinks argument is optional. If "nosymlinks" is included in the command, thumbnail images are copied to the index directory. Normally, symlinks are made from the index directory to the location of the actual thumbnail image file. This option was added specifically to make it possible to distribute Image Class sample data without needing to run imageprep as part of installation.
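The difference between the default behavior and nosymlinks can be shown with a short Python sketch. The function name and signature are hypothetical. The sketch covers only the link-versus-copy choice and does not model other imageprep behavior, such as the capitalization of filename extensions.

```python
import os
import shutil


def place_thumbnail(thumb_path, index_dir, nosymlinks=False):
    """Make a thumbnail available in index_dir as a symlink or as a copy."""
    dest = os.path.join(index_dir, os.path.basename(thumb_path))
    if os.path.lexists(dest):
        os.remove(dest)  # replace an entry left by an earlier run
    if nosymlinks:
        # Self-contained copy, suitable for distributing sample data.
        shutil.copy2(thumb_path, dest)
    else:
        # Default: link from the index directory to the real thumbnail file.
        os.symlink(os.path.abspath(thumb_path), dest)
    return dest
```

A copied thumbnail survives being moved to another machine along with the index directory, whereas a symlink is valid only where the original file still exists at the same path.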
Image Class can be configured to support a wide variety of directory and filenaming conventions. This configuration is done in the $DLXSROOT/bin/i/image/localimageprep.cfg file. Copy the %gTypeHash and @gTypeComparisonOrder definitions from imageprep.cfg to localimageprep.cfg and modify them to support local conventions.
It is even possible to add support for multiple image sizes for a single image in a non-MrSID format such as JPEG. That is, you could have small, medium, and large JPEG images for a single item and have them all be available to the user in the interface. Michigan has done this for the APIS (Papyrus) collection, where there is a mix of MrSID and JPEGs at multiple sizes (though the examples are few and far between; in fact, I can't find one at the moment). Zooming is not possible without MrSID, but the user may select from the multiple sizes.
Below is the %gTypeHash Michigan added to localimageprep.cfg to handle the APIS collection. The hash holds for each type of image an array of regular expressions used to match image files. For APIS, thumbnail images must either be JPEG files in a "thumb" directory or GIF files with "-tn" preceding the extension. SID images are also allowed and can reside anywhere. JPEGs that aren't thumbnails are assumed to be large image files and are given the label "1200", which is a somewhat arbitrary estimation of the maximum pixel dimension of the file. JPEGs with "-50" preceding the extension are labelled as large JPEGs with maximum pixel dimensions of 600. It does not matter too much what the labels are as long as they cause the images to sort properly by size in the user interface.
The @gTypeComparisonOrder array is important because it specifies an order of precedence for identifying images. In this case "thumb" is checked first, and if the filename matches one of the "thumb" regular expressions, it is not tested against the other types (i.e., sid, 600, 1200).
if ($coll eq 'apis') {
    # For each image type, an array of regular expressions used to match image files.
    %gTypeHash = (
        'image:::dynamic:-:thumb' => [
            '/thumb/([^/]+)\\.(jpg)',
            '/thumb/([^/]+)\\.(JPG)',
        ],
        'image:::fixed:-:thumb' => [
            '/([^/]+)-tn\\.(gif)',
            '/([^/]+)-tn\\.(GIF)',
        ],
        'image:::dynamic:-:sid' => ['/([^/]+)\\.(sid)', '/([^/]+)\\.(SID)'],
        'image:::fixed:-:600' => ['/([^/]+)-50\\.(jpg)', '/([^/]+)-50\\.(JPG)'],
        'image:::fixed:-:1200' => ['/([^/]+)\\.(jpg)', '/([^/]+)\\.(JPG)'],
    );
    # Order of precedence: types are tested in this order.
    @gTypeComparisonOrder = (
        'image:::dynamic:-:thumb',
        'image:::fixed:-:thumb',
        'image:::fixed:-:600',
        'image:::fixed:-:1200',
        'image:::dynamic:-:sid',
    );
}
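To make the order of precedence concrete, here is a Python sketch that applies the APIS patterns in comparison order and stops at the first match. It is an illustration only and not how imageprep is implemented. The function name, the example paths, and the choice to anchor each pattern at the end of the path are assumptions made for this example.

```python
import re

# Python transcription of the APIS patterns from the configuration above.
TYPE_PATTERNS = {
    "image:::dynamic:-:thumb": [r"/thumb/([^/]+)\.(jpg)", r"/thumb/([^/]+)\.(JPG)"],
    "image:::fixed:-:thumb": [r"/([^/]+)-tn\.(gif)", r"/([^/]+)-tn\.(GIF)"],
    "image:::dynamic:-:sid": [r"/([^/]+)\.(sid)", r"/([^/]+)\.(SID)"],
    "image:::fixed:-:600": [r"/([^/]+)-50\.(jpg)", r"/([^/]+)-50\.(JPG)"],
    "image:::fixed:-:1200": [r"/([^/]+)\.(jpg)", r"/([^/]+)\.(JPG)"],
}

TYPE_COMPARISON_ORDER = [
    "image:::dynamic:-:thumb",
    "image:::fixed:-:thumb",
    "image:::fixed:-:600",
    "image:::fixed:-:1200",
    "image:::dynamic:-:sid",
]


def classify(path):
    """Return (type, base name, extension) for the first matching type, else None."""
    for type_key in TYPE_COMPARISON_ORDER:
        for pattern in TYPE_PATTERNS[type_key]:
            # Assumption: patterns are matched against the end of the path.
            match = re.search(pattern + r"$", path)
            if match:
                return type_key, match.group(1), match.group(2)
    return None
```

With this ordering, a hypothetical file such as /img/a/apis/CD1/p1-50.jpg is labelled "600" rather than "1200", because the more specific "-50" pattern is tested before the general JPEG pattern. Reversing those two entries would send every non-thumbnail JPEG to "1200".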
Contact dlxs-help@umich.edu if you need assistance with this advanced topic.
If all has gone well, at this point you should have the image files loaded, symlinks generated, and permissions properly set. This is all you have to do to the image files. The database transformation process will be able to locate the image files, as will the CGI middleware.