Editorial, 2006, October 5:
Using JSTOR as a Source of Journal-Title Abbreviations
|
Background
|
The Inventory of the Oldest Scholarly Societies
|
In 1999 the Editor of the
Scholarly Societies Project created a sub-project entitled the
Inventory of the Oldest Scholarly Societies
(Repertorium Veterrimarum Societatum Litterariarum).
The purpose of the Inventory is to gather information pertaining to
scholarly societies that is likely to be of special interest to
historians.
Pages have therefore been created for each of
several hundred
scholarly societies
founded prior to 1850.
Each page gives basic historical
information, as well as an enumeration of
the important journals of the society,
along with abbreviations used for
the journal titles.
|
Abbreviations Found in Journal Indexes
|
For the first several years of the existence of the Inventory, the
abbreviations were drawn exclusively from
indexes of the journal literature,
such as the
Reuss Repertorium
and the
Royal Society of London Catalogue of
Scientific Papers.
The abbreviations were included not only on the appropriate history pages, but also
in a composite alphabetical index in an area entitled
Abbreviations Used for the Journal
Titles of Scholarly Societies.
|
Abbreviations Found in the Journals Themselves
|
In the summer of 2005, it occurred to the Editor
that it might be possible to document the journal-title
abbreviations used
in the literature itself by using a
large searchable archive like
JSTOR.
Further background information may be found in
The Problem of Early Journal-Title Abbreviations.
|
|
The JSTOR Archive
|
Scope
|
The JSTOR Archive
is a rich repository, containing the full text of lengthy runs
of hundreds of journals.
The emphasis is on English-language
journals, but some journals in other languages are included.
Most of the runs are within the 20th
century, but some runs extend into the 19th century and earlier.
|
Search Engine
|
The search engine of the JSTOR archive allows one to perform a
single search over the entire set of
journal articles in this
immense corpus of scholarly literature.
It is provided with many useful searching
features, including proximity
searching, which allows one to specify how close to one another
the search terms are to appear.
|
Limitations
|
Perhaps the most evident limitation of this archive is that it is a
fee-based service, so that users must
be affiliated with an institution that has a subscription to the resource.
That said, it must be acknowledged that the JSTOR
archive is truly extraordinary, its only other limitations
being rather minor in nature.
Among the remaining limitations:
- The pages are not scalable; the
size displayed is the largest available.
This can be a problem with some journals notorious for their
use of truly minuscule fonts.
- In the results of a search, the
search terms are not highlighted;
hence a careful visual scan of the results is required in order to
determine why the pages were retrieved.
- The program that converted the digital images to searchable text
sometimes does not recognize multiple
columns of text.
Hence text strings that should be contiguous are sometimes broken.
- Text employing diacritical marks
(usually accent marks) is not always well
handled.
For example, the é character appears to be converted to a
simple e over 80% of the time, but is treated as garbage the rest
of the time.
At the other extreme, the ü character appears to be
converted to a simple u less than 10% of the time, and is treated
as garbage the rest of the time.
- Characters in italic are sometimes
poorly handled.
A good example is the rather ornate capital J with a top banner to the
left that is found in some fonts;
this is quite often treated as garbage.
- There are
problems in distinguishing among similar
characters, for example among lowercase l,
uppercase I and the numeral 1.
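The diacritics limitation suggests that a search term containing accented letters may need to be tried in its unaccented form as well. The following is only a minimal sketch of that idea in Python, not a description of the Editor's procedure; the function names `strip_diacritics` and `search_variants` are hypothetical, and the sketch does nothing about characters that the conversion program turned into garbage.

```python
import unicodedata


def strip_diacritics(term):
    """Return the term with accent marks removed (e.g. é -> e, ü -> u)."""
    # NFD decomposition splits an accented letter into a base letter
    # followed by one or more combining marks, which are then dropped.
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_variants(term):
    """Return the term as written, plus its unaccented form if different."""
    plain = strip_diacritics(term)
    return [term] if plain == term else [term, plain]


print(search_variants("Société"))
print(search_variants("Newcastle"))
```

A term without accented letters yields a single variant, so the helper can be applied to every word of a title without creating duplicate searches.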
|
|
Searching the JSTOR Archive for Abbreviations
|
Searching for Abbreviations of a Specific Journal
|
In order to use the JSTOR archive to locate journal-title abbreviations
used for a specific journal of a scholarly society, the Editor first
creates a sequence of search
expressions that appear likely to result in a
relatively complete search of the archive.
The steps involved in this process are described below.
|
Steps in Constructing the Search Expressions
|
There are four steps in creating the
sequence of search expressions:
(1) identifying a minimal set of critical
words in the journal title;
(2) coming up with abbreviated
versions of those words;
(3) setting an adjacency value, that
is, determining how closely these word fragments need to occur together; and
(4) coming up with the final sequence of search
expressions by taking all reasonable combinations of the abbreviated words and
applying suitable
adjacency operators to specify how
closely together the terms should occur.
|
An Example
|
Here is an example of a sequence of searches appropriate for the journals
of the Society of Antiquaries of Newcastle upon Tyne with the
number of hits found on 2006, July 9:
"Soc Antiquaries Newcastle" ~2 = 10
"Soc Antiquar Newcastle" ~2 = 0
"Soc Antiqu Newcastle" ~2 = 2
"Soc Antiq Newcastle" ~2 = 10
"Soc Antt Newcastle" ~2 = 0
"Soc Ant Newcastle" ~2 = 8
|
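The way the six search expressions in the example arise from the four steps can be sketched in Python. This is an illustration under stated assumptions rather than the Editor's actual tooling: the function name `build_search_expressions` is hypothetical, the `"..." ~N` form is simply copied from the example, and the variant lists are those visible in the example (only the word "Antiquaries" is varied there).

```python
from itertools import product


def build_search_expressions(word_variants, adjacency):
    """Return one proximity search expression per combination of variants.

    word_variants: one list per critical title word, in title order,
        holding the full word and its abbreviated versions (steps 1 and 2).
    adjacency: the adjacency value chosen in step 3.
    """
    expressions = []
    # Step 4: take every combination of the abbreviated words and
    # attach the adjacency operator to each resulting phrase.
    for combination in product(*word_variants):
        phrase = " ".join(combination)
        expressions.append(f'"{phrase}" ~{adjacency}')
    return expressions


# Variants taken from the example; the structure is an assumption.
variants = [
    ["Soc"],
    ["Antiquaries", "Antiquar", "Antiqu", "Antiq", "Antt", "Ant"],
    ["Newcastle"],
]

queries = build_search_expressions(variants, 2)
for query in queries:
    print(query)
```

If more than one word were given several abbreviated versions, the number of expressions would be the product of the list lengths, which is one reason to keep the set of critical words minimal.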
|
Processing the JSTOR Search Results
|
Assessing the Search Results
|
The results of a search consist of a set of
pages that match the search criteria.
One must take each page in turn and examine it to locate the matching passage.
Once the match is found, one needs to ask:
Is this a journal-title abbreviation?
One also needs to ask:
Does this correspond to one of the society's
journals? For example, one needs to check whether the
volume and year designations are consistent with the
cataloguing data.
|
Recording and Verifying Data
|
Once one is certain that an abbreviation corresponds to a particular
journal, one must document the information.
Since JSTOR journal-page images do not support a copy function,
the abbreviation and the year and volume cited must be transcribed manually.
The bibliographic information about the citing
source, however, does support a copy function, so that data may
be recorded reliably.
Because the transcription is a manual
operation, it must be verified;
this is done by copying the transcribed string and sending it back through the search engine as
an exact phrase.
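The verification step can be sketched as a round trip: the transcribed string is turned into an exact-phrase search, and the transcription is accepted only if the citing page comes back among the results. The Python below is a sketch under assumptions. JSTOR is searched by hand in the procedure described here, so `search` is a hypothetical caller-supplied function standing in for the search engine, and the page identifier and abbreviation in the demonstration are invented for illustration.

```python
def exact_phrase_query(transcribed):
    """Wrap a transcribed string in double quotes as an exact phrase."""
    # Collapse runs of whitespace so stray spaces from typing do not matter.
    normalized = " ".join(transcribed.split())
    return f'"{normalized}"'


def is_verified(transcribed, citing_page, search):
    """Return True if the exact-phrase search retrieves the citing page.

    search: hypothetical function mapping a query string to a
        collection of page identifiers.
    """
    return citing_page in search(exact_phrase_query(transcribed))


# Stand-in for the search engine, with invented contents.
fake_index = {'"Proc. Soc. Antiq. Newcastle"': {"page-17"}}


def fake_search(query):
    return fake_index.get(query, set())


print(is_verified("Proc. Soc.  Antiq. Newcastle", "page-17", fake_search))
print(is_verified("Proc. Soc. Antig. Newcastle", "page-17", fake_search))
```

A mistyped character, as in the second call, produces a phrase that no longer retrieves the citing page, which is exactly the kind of slip the verification is meant to catch.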
|
Installing the Data
|
Once the correctness of the data has been verified, it is then made
publicly available.
At the time of writing, this is done by adding the abbreviation data to the society's history page in the
area reserved for the journal in question. Taking our example above, the
abbreviation data is given in the
Society of Antiquaries of Newcastle upon Tyne history page.
The abbreviation data is also
added to the
composite
index of journal-title abbreviations
(maintained manually) in the appropriate file.
In the future it is expected that the composite index will be generated
from the data in the history pages.
|
|
Assessing JSTOR as a Source of Journal-Title Abbreviations
|
The JSTOR Search Engine is Less than Perfect
|
As noted above, the JSTOR search engine is not without certain problems.
It should be noted especially that its poor
handling of diacritical marks
and its frequent misinterpretation of italic
text conspire to reduce the effectiveness of the search engine.
|
JSTOR is Nonetheless Stunningly Useful
|
The usefulness of JSTOR in this application, and its potential for
usefulness in other applications, stems from two
factors.
First, it is a very large archive
that is both broad in that it covers
many academic subjects,
and deep in that it includes journal
runs that cover a considerable time period.
Second, it has a search
engine that allows one to search the entire
archive in one search.
Although the search engine has a few problems, it is nonetheless a very
powerful tool, with many useful search
options.
Taking all things into consideration, the problems with the search engine
pale by comparison with the merits of both the richness of the archive and the
considerable power of the search engine.
It needs to be emphasized that without such a
tool, it would have been impossible
to assemble so large a collection of journal-title
abbreviations directly from the journal literature itself.
|
|
Similar Sources of Journal-Title Abbreviations
|
Why a Tool like JSTOR is Ideal for the Task
|
As mentioned above,
what makes JSTOR so fruitful a source of journal-title abbreviations is
that it satisfies two criteria.
First, it is a very large archive.
Second, it has a search engine that
allows one to search the entire archive in one search.
|
And, Sadly, Some Others are Not
|
Two candidates that satisfy the first criterion, namely being very large
archives, are
Gallica and the
Göttinger
DigitalisierungsZentrum.
Either of them would be an ideal source to search, were it not for the
fact that neither of them has an associated search
engine.
|
|
Published 2006, October 5
Jim Parrott, Editor
Scholarly Societies Project, and
Repertorium Veterrimarum Societatum Litterariarum
|
|