Those of you who find it difficult to grasp the implications of the computer for the storage, manipulation and transfer of information associated with the traditional scholarly disciplines may take some comfort from the fact that the professionals (it would seem) are as unsure about the future as the amateurs. Which is precisely why it is such an interesting period in which to be living and exercising our skills. We have a rare opportunity not only to benefit from the evolving technology, but actually to use that technology to enhance our methods of dealing with information. We have, to put it simply, an opportunity to reconsider the grammar of research. We can - and many do - resignedly admit to an incompetence to deal with computer technology and reflect, wistfully, on the halcyon days when grappling with the sources on which all scholarship is ultimately built involved understood procedures. The grammar of those procedures was sufficiently simple to enable the autodidactic principle, and most of us, I suspect, learned by example. There are no degrees in scholarship and "methods of research" - in most universities occupying no more than a token portion of what is taught as an option - has been a part of the grammar about which elementary introductions to bibliography are notoriously silent. It is assumed that every student knows about libraries, books, bibliographies, catalogues, card-indexes, and all the other tools for research. In the last hundred years these tools have multiplied at a frightening rate and the real threat to enlightenment lies less in the insufficiency of information than in its overwhelming flood.
That flood began almost exactly a hundred years ago. The 1870s witnessed developments in the technology of printing which encouraged the belief that, within a generation, cooperative effort could produce a "universal catalogue". A scheme for a reduction of all knowledge of the books in the world within the framework of a usable catalogue was born in the year preceding the Great Exhibition and was actively promoted by the Society of Arts for twenty-five years. The simplicity and grandeur of the scheme was typical of the time: and it required, for its successful prosecution, nothing less than international cooperation between the libraries of Europe to be secured by the benevolent agency of the Prince Regent. Three factors combined to frustrate this astounding project: the founding, in 1877, of the Library Association; in 1892 of the Bibliographical Society of London; and the publication of the General Catalogue of the Library of the British Museum. The close association which existed between these three institutions in the period up to 1920 has diminished since as the objectives of library management and bibliographical research have become steadily more separate. The dilemma which I bring to your attention is this: the same institution which "invented" the short-title catalogue (for all periods and languages) also pioneered the meticulously detailed bibliographical catalogue (the catalogue of fifteenth-century books) and continues to publish both. With an enthusiasm which took our American cousins completely by surprise, the British Library was persuaded to act as benevolent (if uncomprehending) midwife to the birth of retrospective bibliographical cataloguing in machine-readable form.
The infant - ESTC - bears many resemblances to its ancestors, and its outward appearance (as a printed catalogue) provides few clues as to its real potential; and yet it points the way to a generation of research tools which will radically alter the manner in which research is carried out.
I began by offering some comfort to those who find the prospect of adapting traditional approaches to the new technology daunting by suggesting that even professionals are as uncertain as amateurs. Let me illustrate this by reference to a book published in September last year. It is by a distinguished professor, recently retired, who has for many years been the head of a Department of Library and Information Science with a strong bias towards bibliographical studies in English literature. The Function of Bibliography by Roy Stokes is an altogether remarkable book. It is written for students of literature, and purports (I quote the "blurb") to describe "a revolution in the study of literary texts". "In recent years, every question in the study of bibliography has had to be reopened, even the most basic of all: What is bibliography?"
It is common practice in producing catalogues of collections of incunabula to give the briefest description followed by references to GW and/or BMC. But Wheatley's most important observation concerns the need to reduce redundant effort in cataloguing books. A similar realization, that a "once and for all" description could be made available through a complex structure of national and international computer networks to libraries throughout the world, has led to the creation of huge bibliographical databases financed by libraries on a "pay as you use" basis. ESTC seems to me an interesting development, for though it can claim to be in the tradition of short-title catalogues it in fact goes further than either STC or Wing in fullness of descriptions and annotations. And because it is in machine-readable form it is susceptible to instantaneous correction and modification. Every ESTC record is, in a very real sense, a tentative description which can be modified as contributions from libraries throughout the world are assimilated. In order to appreciate why ESTC occupies a unique position in the contemporary scene, distinct in important respects from the databases established for modern books, it is necessary to remember that one of the most important discoveries of bibliography in recent times has been the fallibility of the individual copy as a witness of the testimony available for the edition.
Some books, especially contemporary books, composed, printed and bound entirely by mechanical processes lend themselves to formalized description and itemization. Uniformity of output is, after all, a desideratum of modern book-production, and contemporary convention has made possible the concept of the International Standard Book Number (ISBN), or unique identifier. It follows, therefore, that a methodical and formalized set of descriptive principles stands a fair chance of success in identifying, beyond reasonable doubt, precise and universally accepted details for a book produced by precise and universally accepted methods. But a sizeable proportion of the books consulted in research libraries were printed in the period before 1850 (the era of the hand-produced book), and however urgent the need to simplify the cataloguing of books flooding from the presses of the world, there remains a distinct and determined community whose interests focus on the past. Recognition that books produced by hand can manifest peculiarities not normally encountered in books produced by machines is now taken for granted by those concerned with the editing of literary texts. If bibliography is to serve scholarship it follows that a descriptive format, capable of revealing the significant features of a hand-printed book, has to be developed. ESTC has, within limitations imposed by the available funding, succeeded in devising a format which does minimum violence to the evidence yielded by the original work; is compatible with the cataloguing standards that have emerged in the wake of computer cataloguing; and is hospitable to books of different periods and printed in different languages. The medium in which the growing body of information is stored is electronic, and that makes it possible to interrogate the file in ways which have never previously been possible with printed catalogues.
The traditional grammar of research necessitates the creation of what I shall call "disjunctive indexes": these commonly take the form of separate files of cards, or slips, arranged in an order which is meaningful to the compiler. Many bibliographies, and some sophisticated catalogues, are furnished with several such indexes. To take an example from a recently published catalogue of fifteenth-century books in Oxford libraries (outside the Bodleian) by Dennis Rhodes, it is possible, using the indexes he provides, to discover incunables in Magdalen, incunables printed in Bologna, incunables printed by Jenson, and incunables formerly owned by the founder of Corpus, Richard Foxe. The arrangement of the catalogue makes it a simple matter to discover which fifteenth-century editions of Boethius are to be found. Rhodes' catalogue is generous in its indexes, even if the answers to certain questions will require some effort: one must consult three indexes in order to discover which editions of the classics printed in Paris are to be found in colleges founded before 1500; one must read the entire catalogue to discover which books survive in contemporary bindings. This is no great labour, because Rhodes' catalogue describes fewer than two thousand editions in just over two and a half thousand copies. But when catalogues exceed a certain size, indexing becomes impractical, and the labour required to discover items conforming to particular determining features out of the question. The great benefit to be derived from access to a machine-readable catalogue, supported by imaginatively constructed software, is that we can ask multiple questions. The complexity of the questions which can be asked is determined by the structure of the bibliographical record. This is the precise point at which the grammar of conventional cataloguing, conditioned by the appearance of print on a page, has to be abandoned in favour of a grammar more closely associated with Ramist logic.
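The difference between consulting three separate indexes and asking one compound question can be sketched in a few lines of Python. This is an illustration only: the field names (`subject`, `place`, `college_founded`), the `search` helper and the three sample records are invented here, and reproduce neither Rhodes' catalogue nor the structure of an ESTC record.

```python
# Hypothetical records: field names and holdings are invented for illustration.
RECORDS = [
    {"id": 1, "author": "Boethius", "subject": "classics", "place": "Paris",
     "college": "Magdalen", "college_founded": 1458},
    {"id": 2, "author": "Cicero", "subject": "classics", "place": "Venice",
     "college": "Magdalen", "college_founded": 1458},
    {"id": 3, "author": "Virgil", "subject": "classics", "place": "Paris",
     "college": "Corpus Christi", "college_founded": 1517},
]


def search(records, *conditions):
    """Return the records that satisfy every condition (a logical AND)."""
    return [r for r in records if all(cond(r) for cond in conditions)]


# One compound question instead of three separate index look-ups:
# classics, printed in Paris, held in a college founded before 1500.
hits = search(
    RECORDS,
    lambda r: r["subject"] == "classics",
    lambda r: r["place"] == "Paris",
    lambda r: r["college_founded"] < 1500,
)
print([r["id"] for r in hits])
```

Each condition corresponds to one of the printed indexes. The machine combines them in a single pass, which is why the structure of the record, rather than the layout of a page, sets the limit on what can be asked.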
A computer can index information in a variety of ways. In what is termed keyword indexing every word is keyed to the fundamental referent - the anchor - which is a record number. The difficulty with keyword indexing is that the computer is unable to make semantic discriminations, since words are stored as strings of binary data. So a search on the word garden will certainly retrieve a substantial number of books on horticulture but it will also retrieve titles with other uses of the word. In ESTC a substantial number of elements within the bibliographical record are indexed in this way. Phrase indexing is narrower and is generally used for particular sequences of letters (symbols for libraries, for example). The narrowest (and incidentally the simplest) indexing is reserved for the alpha-numeric code where a letter or number stands for a specific bibliographical feature. Extended use of such codes might well lead to a system for detailed and sophisticated description of numerous features of early printed books. Let me explain.
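A keyword index of the kind just described might be sketched as follows. The record numbers are invented, the handful of titles is chosen only to make the point, and the `build_keyword_index` function is a hypothetical helper, not a description of how the ESTC file is actually indexed.

```python
import re
from collections import defaultdict

# Sample titles keyed by invented record numbers (the "anchor").
TITLES = {
    101: "The flower garden displayed",
    102: "A garden of spiritual flowers",
    103: "The garden of Cyrus",
    104: "Observations on modern gardening",
}


def build_keyword_index(titles):
    """Map each lower-cased word to the set of record numbers containing it."""
    index = defaultdict(set)
    for record_number, title in titles.items():
        for word in re.findall(r"[a-z]+", title.lower()):
            index[word].add(record_number)
    return index


index = build_keyword_index(TITLES)

# The index matches strings, not meanings: record 101 is horticultural,
# but 102 and 103 are retrieved as well, while 104 ("gardening") is missed.
print(sorted(index["garden"]))
```

The search retrieves every record containing the string and nothing else, so the semantic sorting is still left to the reader.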
The development of bibliographical analysis during the past twenty five years has led to a recognition that many features of early printed books deserve particular notice in methodical descriptions. Such features include: binding (material, colour, finishing, ornament, provenance, date); paper (sheet size, quality, mould, watermark); typography; layout; illustration; for books printed after 1800 we will have to take into account method of printing, printing techniques used for illustrations (over a hundred processes are known). There are many others that I shall not trouble you with. Why, you may ask, go to all that effort? The answer is simple. Print on the page (whether it is a bibliography, a catalogue, or an edition of a text) is disposed in a manner which presumes that the reader will wish to have certain questions answered, and the compiler/editor predicates his arrangement on those questions. Bibliographies, manually compiled, can answer some questions. In machine-readable form they are capable of answering questions which require imaginative effort to predict. I never cease to be surprised by the questions which are put to us in ESTC: (1) how many separate poems did Tonson print in folio/quarto/octavo? (2) did the Dublin book-trade suffer the same reduction in output between 1730 and 1750 as London? (3) how many books published in the eighteenth century have the word "consciousness" in the title? (4) how many poems printed between 1701 and 1760 claim to be written in imitation of Milton or Spenser? (5) which travel books concerned with Scandinavia and Iceland were in the library of Sir Joseph Banks?
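A question such as (2) reduces, in machine terms, to counting records by place and decade. The sketch below assumes that each record yields a place of publication and a year. The eight imprints are invented and say nothing about the actual output of either book trade, and `output_by_decade` is a hypothetical helper.

```python
from collections import Counter

# Invented (place, year) pairs standing in for imprint data.
IMPRINTS = [
    ("London", 1728), ("London", 1733), ("London", 1741), ("London", 1745),
    ("Dublin", 1729), ("Dublin", 1735), ("Dublin", 1736), ("Dublin", 1748),
]


def output_by_decade(imprints):
    """Count items per (place, decade), e.g. ("Dublin", 1730)."""
    return Counter((place, year - year % 10) for place, year in imprints)


counts = output_by_decade(IMPRINTS)
for (place, decade), n in sorted(counts.items()):
    print(f"{place} {decade}s: {n}")
```

A printed catalogue arranged by author could answer the same question only by a reading of every entry.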
Unfortunately, ESTC has not found it possible to code bibliographical data in such a way that all questions can be answered. This is because of the sheer number of items to be dealt with (a universe of some 400,000 separate items surviving in over two million copies), and the fact that copy-specific data submitted varies enormously from library to library. In some cases (records submitted from the Houghton Library at Harvard or the Humanities Research Center at the University of Texas for example) the data supplied is meticulous and detailed; in others it is barely sufficient to enable editors to assign the copy to a particular edition. As a compromise, between the elaborate bibliography and the traditional short-title catalogue, ESTC is no more than a first step in the development of bibliographical techniques in the computer age.
The brief guide which I have prepared, as an introduction to the structure of ESTC records and a simple demonstration of the sorts of questions which can be answered, in no way does justice to the capabilities of the computer. In order to indicate how comprehensively it is possible to retrieve combinations of features, I offer you the following profile of a bibliographical record. It is divided into three sections: the first describes all features of the item which are common to the edition; the second, those features which are common to some (but not all) copies of the edition (usually described as variants); the third, those features which pertain to individual copies.
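One way to picture the three sections is as three nested levels of record. The following is a minimal sketch under stated assumptions: the class names, the field names and the sample data are all invented, and attaching every copy to a variant is only one possible arrangement.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Copy:
    """Features pertaining to one individual copy."""
    library: str
    binding: str = ""
    provenance: str = ""


@dataclass
class Variant:
    """Features shared by some, but not all, copies of the edition."""
    note: str
    copies: List[Copy] = field(default_factory=list)


@dataclass
class Edition:
    """Features common to every copy of the edition."""
    title: str
    place: str
    year: int
    variants: List[Variant] = field(default_factory=list)

    def all_copies(self):
        """Flatten the copies recorded under every variant."""
        return [c for v in self.variants for c in v.copies]


# Invented example: one edition, two variants, three copies.
edition = Edition(
    title="An invented title", place="London", year=1750,
    variants=[
        Variant(note="first state",
                copies=[Copy(library="Library A", binding="calf")]),
        Variant(note="with cancel title-page",
                copies=[Copy(library="Library B"),
                        Copy(library="Library C", provenance="bookplate")]),
    ],
)
print([c.library for c in edition.all_copies()])
```

Keeping the levels separate means that a correction to the edition is made once, while a newly reported copy adds only its own particulars.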
Some of these will be indexed by keyword, but numerous features of printed books lend themselves to the simplest of coding. Typographical features, for example, can be coded by assigning a single letter to each type of letter-form and size. The same is true of layout and design. Bindings can be classified (admittedly on a rough and ready basis) according to material, type, period, provenance, and finishing. For fifteenth-century books a vast amount of such data already exists in various catalogues, and if consolidated into one machine-readable file it would make it possible for incunabulists to study more effectively the history of book production in its infant period.
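Single-letter coding of this kind can be illustrated with bindings. The code tables below are invented for the purpose and are not an existing standard or the ESTC scheme. The sketch assumes a two-letter code per copy, the first letter for material and the second for period.

```python
# Invented code tables: one letter per feature value.
BINDING_MATERIAL = {"c": "calf", "v": "vellum", "m": "morocco", "p": "paper boards"}
BINDING_PERIOD = {"c": "contemporary", "l": "later"}

# Each copy carries a two-letter binding code: material, then period.
COPIES = {
    "copy-1": "cc",
    "copy-2": "vl",
    "copy-3": "vc",
}


def decode_binding(code):
    """Expand a two-letter binding code into readable terms."""
    material, period = code
    return BINDING_MATERIAL[material], BINDING_PERIOD[period]


def copies_with_period(copies, period_letter):
    """Select copies whose binding code carries the given period letter."""
    return sorted(cid for cid, code in copies.items() if code[1] == period_letter)


print(decode_binding("vc"))
print(copies_with_period(COPIES, "c"))
```

With such codes in place, a question like "which books survive in contemporary bindings" becomes a single look-up on one letter.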
It will be obvious, to anyone who studies the on-line search guide I have distributed, that research projects in English eighteenth-century studies will rest on a fairly secure foundation of bibliographical history. By 1988, when the holdings of over 800 libraries in Europe, North America and Australasia have been entered on the file, the identified corpus of an author's work will be easily ascertainable. Chronologically arranged bibliographies, such as Adams' pamphlets on the American Revolution, will be very much easier to compile. Studies of the book trade - insofar as imprints are revealing - will be possible. The use of keyword searching to determine a first list of books on a subject where keywords are helpful will reduce considerably the time spent on scanning catalogues. Much of the labour involved in discovering where copies of books are to be found will be reduced through the comprehensive locations which ESTC will incorporate, and there is a searchable field reserved for microfilm copies. This will have two significant consequences: scholars will find it difficult to justify requests for transatlantic travel funds on the grounds of what I call the "discovery" factor; and those who depend on scholarly research will understandably expect a high standard of original, intellectual and interpretive contribution. It was to such ends that the ambitious schemes for a universal bibliography were directed a hundred years ago. And it is - almost - within grasp.
But if the advent of the computer augurs an end to much of the drudgery associated with research (the discovery of the first English grammar printed in the Netherlands in a library in Bamberg barely compensated for the discomfort of sleeping in a Volkswagen!), and that is a gain, there will also be a loss. I remember something Howard Nixon said to me many years ago, when he was Superintendent of the North Library in the British Museum: "Don't be in too much of a hurry: it is only when you have worked in about a hundred libraries, and handled a few thousand books, that you begin to understand the special language with which books communicate."
Research is a highly sophisticated form of play. When we play we observe, as sociologists are continually reminding us, a subtle obedience to a grammar which is as unmistakable as it is indefinable. I am confident that the new technology will assist us in the revelation of the past. The extent of that revelation will depend, in large measure, on the contribution which all of us, whatever our preoccupations of the moment, can make.