Academic Text Service (ATS)

380 Meyer Library
email: ats@lists.stanford.edu
phone: 725-3163


Mounting an E-Text in the PAT Environment, Step by Step

So far, we've taken a relatively theoretical look at delivering electronic texts. Now let's consider a somewhat more practical example by going through the steps in mounting an individual text (or set of texts) within the OpenText (aka PAT) environment, glossing where necessary.

We're assuming that the text to be mounted is in SGML. While this is not a requirement for the PAT system, the steps that we describe here are dependent on this. Texts with other structures, e.g. text with other forms of tagging or untagged text, would require slightly different processing.

The first two steps result in an SGML document that is valid for your chosen DTD. The creation/tagging/validation process will need to be iterated until the document passes the validation. PAT will not build SGML indexes on documents that are not valid, and you'd want to have a valid, portable, system-independent document instance, right?
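
Before you hand a document to a real validating parser, a quick pre-check can catch the most common tagging slips early in each iteration. What follows is only a minimal sketch in Python, not part of PAT or of any SGML toolkit: the function name `unbalanced_tags` and its `empty_elements` parameter are made up for illustration, and it assumes fully normalized markup (no omitted end tags, no tags hidden inside comments). It checks nesting only; it does not validate against a DTD.

```python
import re

# Matches start tags such as <POEM TYPE="SONNET"> and end tags such as </POEM>.
# Declarations (<!DOCTYPE ...>) and processing instructions (<? ...>) are skipped
# because the name must begin with a letter.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.\-]*)[^<>]*>")


def unbalanced_tags(text, empty_elements=()):
    """Return (line_number, message) pairs; an empty list means the tags nest cleanly."""
    empty = {name.upper() for name in empty_elements}
    stack = []      # currently open elements as (name, line_number)
    problems = []
    for match in TAG.finditer(text):
        closing, name = match.group(1), match.group(2).upper()
        line = text.count("\n", 0, match.start()) + 1
        if name in empty:
            continue                      # elements declared EMPTY have no end tag
        if not closing:
            stack.append((name, line))
        elif stack and stack[-1][0] == name:
            stack.pop()
        else:
            problems.append((line, f"unexpected </{name}>"))
    # Anything still on the stack was opened but never closed.
    problems.extend((line, f"<{name}> never closed") for name, line in stack)
    return problems


sample = "<POEM>\n<L>line one</L>\n<L>line two\n</POEM>\n"
for line, message in unbalanced_tags(sample):
    print(f"line {line}: {message}")
```

A clean result here only means the document is worth sending on to the validator; the validator's verdict is still the one that counts.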

The next step is simple, but depends on your having established a document management environment that is 1) easy to append to; 2) easy to access; 3) easy to back up. For a smallish database/delivery system, these features may appear to be unimportant, but as your database and number of texts grow, planning for growth and maintenance at the outset pays off substantially. Depending on how users will access the files (local client, commercial client, Web, etc.), the organization of the user-visible database structure may also play a significant role (i.e. how will users find the texts or corpora they want to access?).
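
To make the three properties concrete, here is one possible arrangement, sketched in Python. Everything in it is an assumption for illustration rather than a PAT convention: one directory per text under `texts/`, an `index/` subdirectory beside each source file, and a single tab-delimited `catalog.tsv` that you append to as texts arrive. The function names `register_text` and `find_texts` are hypothetical.

```python
import csv
from pathlib import Path

CATALOG_FIELDS = ["text_id", "title", "source_file"]


def register_text(root, text_id, title, source_name):
    """Create the directory for one text and append a row to the shared catalog."""
    root = Path(root)
    text_dir = root / "texts" / text_id
    # Index files get their own subdirectory beside the source document.
    (text_dir / "index").mkdir(parents=True, exist_ok=True)
    catalog = root / "catalog.tsv"
    is_new = not catalog.exists()
    with catalog.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t")
        if is_new:
            writer.writerow(CATALOG_FIELDS)      # header row, written once
        relative = (Path("texts") / text_id / source_name).as_posix()
        writer.writerow([text_id, title, relative])
    return text_dir


def find_texts(root, word):
    """Return catalog rows whose title contains the given word (case-insensitive)."""
    with (Path(root) / "catalog.tsv").open(newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))
    return [row for row in rows if word.lower() in row["title"].lower()]
```

Because everything lives under one root, backing up means copying one tree, and adding a text never touches the existing ones. The same catalog can later feed whatever listing your users see.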

The usual database building process consists of creating a configuration file that specifies the indexing characteristics (word index, character index, stop terms, handling of special characters, etc.), the location of the files and files to be indexed together, SGML regions of the text to be built, and resulting names and locations of the index files. The nature of your application will determine the complexity of the index building operation; simple ones will use the build scripts that come with PAT, more complex ones may need individual runs or scripting. After the index build, it's advisable to test the index operations against known values to be sure that all has gone as planned.
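
The idea of testing an index against known values can be sketched without PAT at all. The Python below is a toy stand-in, not PAT's indexer and not its configuration syntax: `build_word_index` and `check_known_values` are hypothetical names, the word pattern is deliberately naive (runs of ASCII letters), and the stop terms are passed in as a plain list. In practice you would run the same queries against the real index and compare the hit counts with figures you have worked out by hand.

```python
import re
from collections import defaultdict


def build_word_index(text, stop_terms=()):
    """Map each lower-cased word to the character offsets where it starts."""
    stop = {term.lower() for term in stop_terms}
    index = defaultdict(list)
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group(0).lower()
        if word not in stop:              # stop terms are left out of the index
            index[word].append(match.start())
    return dict(index)


def check_known_values(index, expected_counts):
    """Return {word: (expected, actual)} for every word whose hit count is wrong."""
    mismatches = {}
    for word, expected in expected_counts.items():
        actual = len(index.get(word, []))
        if actual != expected:
            mismatches[word] = (expected, actual)
    return mismatches


sample = "The rose is a rose, and the thorn is a thorn."
index = build_word_index(sample, stop_terms=["the", "a", "is", "and"])
known = {"rose": 2, "thorn": 2, "the": 0}   # counts established by hand
print(check_known_values(index, known))
```

An empty result means every known value matched. Including a stop term with an expected count of zero, as above, also confirms that the stop list was actually applied.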

If you are using the OpenText clients and viewing program (aka Lector), you'll need to write configuration files that establish the user's view into the textual database. This includes defining how the information is to be presented, the main textual region in the document (e.g., a poem in a book of poetry, perhaps), and how many views are possible. The viewers also require configuration so that the presentation of SGML information makes sense visually. For complicated DTD's, the configuration files can be time-consuming to prepare, and they are not always reusable for other types of information.

Similarly, configuration files and/or Perl scripts for Web access tailored to the current e-text may also need to be created. If you're interested in serving SGML data directly by using SGML viewers like Panorama (currently the only one, but not, I'd predict, for long), then you'll also need to create navigators, the Panorama equivalent of the configuration files.

As the final step in the technical development, you'll want to be sure that the environment and view into the text file or corpus that you've created work as you expect. If they do, see whether the procedures and configuration files that you've created for this file can be used as models for similar files. If you're dealing with files that have been created elsewhere against DTD's that are unlike the one you've instituted for your own text creation (you do have your own DTD worked out by now, right?), this may not be possible, but in many instances your work may be reusable.

When the technical development of a new e-text is complete, all that remains is to hook it into your delivery environment, be it through the Web or another client. If you are using an external client such as those from OpenText, you may need to distribute or otherwise make available the configuration files that call the new resources.

Equally important, however, is to be sure that your users know that the new resource is available: connections to your OPAC, announcements via listservs, direct mailings, Web pages, and classroom presentations are all good ways of making availability known. You've probably developed a plethora of means of getting the word out to patrons for other resources; now's the time to use the same ones for your e-texts as well.

Last Updated: July 17, 1995