Background

Over the last eight years, the Centre for Computing in the Humanities (CCH) at King's College London has been directly involved in four projects focussed on Anglo-Saxon studies. Together, the four projects represent a broad range of scholarly disciplines and perspectives.

At present the projects are independently managed and funded; the different academic goals of each are reflected in a variety of underlying technical approaches (PASE is database-driven whilst eSawyer is a dynamic XML publication, for example).

Apart from the obvious correlation of subject domain, all of the projects are implicitly related by a common focus on the Anglo-Saxon Charters as a collection of core primary sources. Significantly, a universal system of identifiers already exists for the 1875 known charters. These identifiers are known as Sawyer numbers, and were conceived by Peter Sawyer in his Annotated List and Bibliography of 1968, of which the eSawyer project is a new version. This common point of reference provides, in effect, a linking mechanism that is already well understood by the scholarly community. Our aim in this project is to build upon this and other points of potential interconnection to develop a new, highly interconnected hybrid resource - the Anglo-Saxon Cluster - to unlock and bring together the repositories of knowledge embodied in each of the four component resources.
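
The linking role of Sawyer numbers can be illustrated with a minimal sketch in Python. It assumes that each resource can export records carrying a Sawyer number in a field called "sawyer"; that field name, the accepted input forms and the function names are hypothetical and are not taken from any of the projects.

```python
import re
from collections import defaultdict

# Accepts forms such as "S 8", "S8", "s 8" or the bare number 8.
# Suffixed or otherwise irregular references are rejected rather than guessed at.
_SAWYER = re.compile(r"^\s*S?\s*(\d+)\s*$", re.IGNORECASE)


def normalise_sawyer(ref):
    """Return the Sawyer number in `ref` as an integer."""
    match = _SAWYER.match(str(ref))
    if match is None:
        raise ValueError(f"not a recognisable Sawyer number: {ref!r}")
    return int(match.group(1))


def cluster_by_sawyer(resources):
    """Group records from several resources under their shared Sawyer number.

    `resources` maps a resource name to a list of record dicts, each of which
    has a "sawyer" field (a hypothetical field name used for illustration).
    """
    clusters = defaultdict(dict)
    for name, records in resources.items():
        for record in records:
            key = normalise_sawyer(record["sawyer"])
            clusters[key].setdefault(name, []).append(record)
    return dict(clusters)
```

Because the key is normalised before grouping, records that cite the same charter as "S 8" in one resource and as 8 in another end up in the same cluster, without either resource having to change its own data.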

One of the key premises of the Anglo-Saxon Cluster is that it should enrich and increase the visibility and utility of the component resources, without requiring them to be fundamentally altered in any way.

Aims and Objectives

The project aims are to:

  • Develop a new web-based digital resource articulated around the Anglo-Saxon charters as core material, through which the data and the corresponding metadata embodied in each of the component projects will be available together in a thematic cluster.
  • Build up an unprecedented picture of the relationships and associations which implicitly exist between the data in each component resource, but which are masked by the fact that each project has a distinct repository and interface.

The project has achieved its aims by:

  • Assessing established and emerging technological approaches to data aggregation
  • Scoping the potential for charter-based integration of the four base projects
  • Assessing the requirements of users, including researchers and the wider public
  • Developing models and prototypes for aggregation/integration
  • Updating all charter-related encoding to TEI P5 standard
  • Testing and disseminating the project website and tools.

Overall approach

The project takes as its starting point the body of charter texts and the encoding model assembled for the ASChart project. The texts are in XML, originally encoded according to version P4 of the Text Encoding Initiative (TEI) guidelines; they have now been brought up to the P5 standard. The charter encoding was initially modelled to represent the diplomatic discourse (i.e. mainly the recording of formulas such as the proem and invocation) and to refer to relevant external authority data (e.g. markup of the names and roles of persons mentioned in the charter). Because it uses the TEI P5 standard, the model can also be extended to encode additional structural and semantic distinctions.

The use of TEI P5 and the ODD documentation language provides a robust framework that, once documented and published, can be extended and adopted by others interested in the encoding of charters, both within and outside the TEI community.

A functional specification was developed during the first phase of the project, based on an analysis of the data available in each underlying resource. A high priority throughout development has been to maintain a highly iterative cycle of prototype interface development and feedback from the extended project team and core resource project partners, following the Agile software development methodology. This process resulted in a final functional specification and wireframe diagrams, which together form the reference documents for the remainder of the web application and interface development.

Two technical approaches were assessed for their value as the basis of designing models for aggregator systems. The first model is based on the development and integration of a Web Services client into each of the four underlying projects; the second model involves a more direct ingest process and a centralised index.

Following the assessment of the models, the first was selected as the basis for developing the prototype aggregator. Alongside the prototype, a user interface has been developed to enable complex queries to be tested and the results analysed.
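
The difference between the two models can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the project's implementation: the class names, the "sawyer" field and the in-memory stand-ins for each project's web service are all hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
# A "service" stands in for a project's web service: any callable that takes
# a Sawyer number and returns the matching records from that project.
Service = Callable[[int], List[Record]]


def make_service(records: List[Record]) -> Service:
    """Wrap a list of records as a stand-in for a remote project service."""
    def service(sawyer: int) -> List[Record]:
        return [r for r in records if r["sawyer"] == sawyer]
    return service


class FederatedAggregator:
    """Model 1: forward each query to every project's service at query time."""

    def __init__(self, services: Dict[str, Service]) -> None:
        self.services = services

    def lookup(self, sawyer: int) -> Dict[str, List[Record]]:
        results = {}
        for name, service in self.services.items():
            hits = service(sawyer)
            if hits:  # leave out projects with nothing to say about this charter
                results[name] = hits
        return results


class CentralIndexAggregator:
    """Model 2: ingest records up front and answer queries from one index."""

    def __init__(self) -> None:
        self.index: Dict[int, Dict[str, List[Record]]] = {}

    def ingest(self, name: str, records: List[Record]) -> None:
        for record in records:
            by_project = self.index.setdefault(record["sawyer"], {})
            by_project.setdefault(name, []).append(record)

    def lookup(self, sawyer: int) -> Dict[str, List[Record]]:
        return self.index.get(sawyer, {})
```

In the first model the component resources stay authoritative and are queried live, at the cost of one request per project per query. In the second, queries are answered locally, but the index has to be refreshed whenever a component resource changes.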

In addition to this integrated search facility, a Portal provides a single point of entry to the constituent projects. Furthermore, a draft encoding model for Anglo-Saxon charters has been produced. The goal here is to produce a single coherent model which encompasses the different representations of charters in the constituent projects. Particular attention is also paid to those components that constitute potential interconnections with data and metadata that are external to the text. Names of individuals and locations, for instance, are encoded so that they can be cross-referenced systematically to the PASE project. Any element that contains metadata (e.g. the content of the teiHeader) has also been considered a potential candidate for creating connections.
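
As an illustration of how encoded names can act as connection points, the following Python sketch collects the references carried by name elements in a TEI P5 document. It assumes that persons and places are marked up with the standard TEI persName and placeName elements and point outwards through a ref attribute; the sample document, its identifiers and the function name are invented for the example and do not reproduce the project's encoding model.

```python
import xml.etree.ElementTree as ET

# The TEI P5 namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A made-up fragment: the names and the values of @ref are placeholders.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p><persName ref="pase:example-person">Example Person</persName> grants land at
       <placeName ref="place:example-place">Example Place</placeName>, witnessed by
       <persName>an unidentified witness</persName>.</p>
  </body></text>
</TEI>"""


def extract_references(xml_text):
    """Return (element name, ref value, text) for every referenced name.

    Name elements without a ref attribute are skipped, since they offer no
    connection to an external resource.
    """
    root = ET.fromstring(xml_text)
    found = []
    for tag in ("persName", "placeName"):
        for element in root.iterfind(f".//tei:{tag}", TEI_NS):
            ref = element.get("ref")
            if ref:
                text = "".join(element.itertext()).strip()
                found.append((tag, ref, text))
    return found
```

Calling extract_references(SAMPLE) yields one entry for the referenced person and one for the referenced place; the witness without a ref attribute is left out.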

The development of the charter encoding model has taken into account the work to date of the Text Encoding Initiative (TEI) and of the international Charter Encoding Initiative (CEI), whose interests cover all types of charter from all cultures and periods. We hope that the model will contribute to the further development of that work. The web interface for the project is expressed in XHTML 1.0 and CSS level 2.1, and has been developed to conform to level AA of WAI WCAG 1.0. Web services are described using Web Services Description Language version 2.0 and published in SOAP 1.2 encoded XML.
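
To give a sense of what a SOAP 1.2 message to such a service might look like, here is a minimal Python sketch that builds a request envelope. Only the SOAP 1.2 envelope namespace is standard; the service namespace, the operation name findBySawyerNumber and its sawyer parameter are hypothetical and are not taken from the project's WSDL.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.2 envelope namespace.
SOAP12_NS = "http://www.w3.org/2003/05/soap-envelope"
# Hypothetical namespace for the aggregator's own operations.
CLUSTER_NS = "http://example.org/cluster"


def build_request(sawyer):
    """Serialise a SOAP 1.2 envelope asking for one charter by Sawyer number."""
    ET.register_namespace("env", SOAP12_NS)
    envelope = ET.Element(f"{{{SOAP12_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP12_NS}}}Body")
    # Hypothetical operation and parameter names, for illustration only.
    operation = ET.SubElement(body, f"{{{CLUSTER_NS}}}findBySawyerNumber")
    number = ET.SubElement(operation, f"{{{CLUSTER_NS}}}sawyer")
    number.text = str(sawyer)
    return ET.tostring(envelope, encoding="unicode")
```

A client would typically send the resulting string over HTTP with the SOAP 1.2 media type application/soap+xml.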

Project Outcomes

The project provides or will shortly provide:

  • A practical prototype system with guidance for making complex integrated searches across four resources holding related materials
  • Pointers on how to extend the scope of the systems to include other Anglo-Saxon resources and other medieval resources
  • Indications of how fully developed systems of this kind will benefit both research and teaching
  • Indications of how an aggregated resource may be used by the wider public
  • An evidence and experience base for further research
  • A rich encoding model for the TEI and medieval charters communities