Editorial standards and technical implementation

The editorial approach to the edition respects the testimonies not only as important sources of information, but also as unique textual documents. As far as possible, the documents are published in their entirety so that they can be read from start to end. The editors also reproduced the format and materiality of each document, including the headers and footers that made its function clear, and provided a scan of the original. A special tool was developed to generate word clouds from the encoded (TEI) documents (to be added soon).

The editors decided in favour of a detailed annotation of the documents, which serves two main functions: 1. It provides readers with contextual information by referring to authority sets (EHRI terms, ghettos, camps, organizations, Geonames, etc.) and to further explanations. 2. It allows the documents to be treated as research data which can be reused and aggregated with other documents in the future.

Documents published in the EHRI online edition of the early Holocaust testimonies are encoded in the Text Encoding Initiative (TEI) P5 standard, a widely adopted format for digital editions. The particular TEI dialect can differ depending on the characteristics and needs of a particular edition. Within this flexibility, all EHRI editions rely on references to names, dates, places and people (TEI module namesdates).
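To illustrate what such references look like in practice, the following minimal Python sketch collects the name, place, organization and date elements from a TEI fragment. It is not the EHRI tooling: the sample fragment, the example.org URLs and the function name collect_references are assumptions made for illustration. Only the TEI namespace, the element names and the ref and when attributes come from the TEI standard.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI = "{http://www.tei-c.org/ns/1.0}"
# Elements from the TEI namesdates module (plus date) that this sketch looks for.
TAGS = ("persName", "placeName", "orgName", "date")

# Hypothetical fragment; the URLs are placeholders, not real authority records.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<p>On <date when="1945-06-01">1 June 1945</date> the witness returned to
<placeName ref="https://example.org/place/1">Example Town</placeName> with
<persName ref="https://example.org/person/7">A. Witness</persName>.</p>
</body></text></TEI>"""


def collect_references(tei_xml):
    """Group (surface text, target) pairs by TEI element name."""
    root = ET.fromstring(tei_xml)
    found = defaultdict(list)
    for el in root.iter():
        if not el.tag.startswith(TEI):
            continue
        local = el.tag[len(TEI):]
        if local in TAGS:
            # Names point to an authority record via @ref; dates carry @when.
            target = el.get("ref") or el.get("when")
            found[local].append(("".join(el.itertext()), target))
    return dict(found)


if __name__ == "__main__":
    for kind, items in collect_references(SAMPLE).items():
        print(kind, items)
```

Grouping by element name is one simple way to see how annotated documents can be treated as data: the same pairs could feed an index, a facet or a count.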

Despite the numerous existing approaches to publishing TEI documents online, no available solution fitted the requirements of the EHRI digital edition. The team therefore opted to develop its own set of tools and a front-end platform based on Omeka, a simple but powerful open source software package, and its Neatline mapping plugin. This set of software tools is modular and can be combined or extended in different ways to fit the needs of specific editions. The requirements for the EHRI edition software followed the real-world editorial process, including the selection of documents and their transcription, annotation and translation.

The EHRI online edition of early Holocaust testimonies puts emphasis on following the linked data paradigm, using links to established controlled vocabularies (EHRI for Holocaust-related entities, Geonames for geographic information, etc.). The annotation of documents therefore consists primarily of tagging and linking words or expressions in the documents. The annotation was done in common text editors, where the entities were linked using URLs. Once the annotation and text editing were finalised, the documents were converted to the TEI XML format. The team used the open source tool Odette for this purpose and extended its stylesheet to recognise the types of entities and encode them accordingly, based on the URLs used as references. The TEI files produced in this way were checked by the editors and cleansed of any unwanted formatting. An EHRI-TEI-enrichment utility then created normalised entries for the linked entities in the TEI header (using the EHRI API, Geonames metadata and other resources); these entries were later used to drive the faceted browse and map visualisations.
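The idea of deriving the entity type from the URL used as a reference can be sketched in a few lines of Python. This is only an illustration of the principle, not the extended Odette stylesheet, which the text does not reproduce. The rule table, the host names and path prefixes in it, and the function names tei_element_for and encode_link are all assumptions.

```python
from urllib.parse import urlparse
import xml.etree.ElementTree as ET

# Hypothetical rules: (host, path prefix, TEI element name).
# The real rules used in the EHRI conversion are not given in the text.
RULES = [
    ("www.geonames.org", "/", "placeName"),
    ("portal.ehri-project.eu", "/keywords/", "term"),
    ("portal.ehri-project.eu", "/authorities/", "persName"),
]


def tei_element_for(url):
    """Pick a TEI element name from the URL; fall back to a generic <ref>."""
    parts = urlparse(url)
    for host, prefix, name in RULES:
        if parts.netloc == host and parts.path.startswith(prefix):
            return name
    return "ref"


def encode_link(text, url):
    """Turn a linked expression from the editor into a TEI element string."""
    name = tei_element_for(url)
    # A generic <ref> points to its target with @target, names use @ref.
    attr = "target" if name == "ref" else "ref"
    el = ET.Element(name, {attr: url})
    el.text = text
    return ET.tostring(el, encoding="unicode")


if __name__ == "__main__":
    # Placeholder identifier, not a real Geonames record.
    print(encode_link("Example Town", "https://www.geonames.org/0000000/"))
```

Keeping the rules in a table rather than in code means that a new vocabulary could be supported by adding one line, which fits the modular approach described above.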

The resulting TEI documents were then uploaded to the Omeka web publication platform, where the EHRI Editions plugin automatically populated the database based on the content of the XML files. Interactive map presentations were created from the TEI data.
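As a rough illustration of how map points could be derived from TEI data, the Python sketch below reads place entries with coordinates from a TEI header. It does not show the EHRI Editions plugin or Neatline. It assumes that the enriched header holds place entries with a geo element containing latitude and longitude separated by a space, which is standard TEI but is not confirmed by the text. The sample fragment is trimmed and hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Trimmed, hypothetical header fragment; a full TEI header has more parts.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <profileDesc>
      <settingDesc>
        <listPlace>
          <place xml:id="place_1">
            <placeName>Example Town</placeName>
            <location><geo>50.0 14.5</geo></location>
          </place>
          <place xml:id="place_2">
            <placeName>Place Without Coordinates</placeName>
          </place>
        </listPlace>
      </settingDesc>
    </profileDesc>
  </teiHeader>
</TEI>"""


def places_with_coordinates(tei_xml):
    """Return one dict per place that has a usable 'lat lon' value."""
    root = ET.fromstring(tei_xml)
    points = []
    for place in root.iterfind(".//tei:place", TEI_NS):
        name = place.findtext("tei:placeName", default="", namespaces=TEI_NS)
        geo = place.findtext("tei:location/tei:geo", default="", namespaces=TEI_NS)
        parts = geo.split()
        if len(parts) != 2:
            continue  # skip places without coordinates
        lat, lon = float(parts[0]), float(parts[1])
        points.append({"name": name, "lat": lat, "lon": lon})
    return points


if __name__ == "__main__":
    print(places_with_coordinates(SAMPLE))
```

Places without coordinates are skipped rather than guessed, so a map built from this list shows only entries that the header actually locates.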