Joint MEC TEI conference 2023 — This group brings together material shared by the conference attendees.
The conference theme invites us to think about the need to encode different cultural realms — not only written musical and literary cultures, but also oral cultures, the cultures of underrepresented communities, and even cultural practices beyond language and music, such as dance, theater, and film. In coming together to identify and discuss the commonalities and differences between our two coding communities, we aim to discover new methods and new approaches to encoding culture in all its forms.
Files List
-
Bellini Digital Correspondence meets MEI
In this contribution, we show some features of the BDC digital scholarly edition along with some strategies implemented to automate, wherever possible, the process of textual encoding, analysis and normalisation, as well as some perspectives on the work in progress concerning the encoding of epistolary-related music materials in MEI.
Long paper | Encoding Cultures
-
A (cautionary) tale of two texts
In this presentation, we reflect on our experiences working on two contrasting manuscripts in an institutional environment where TEI has little uptake. In particular, we explore some of the challenges and tradeoffs we encountered creating digital editions with only limited institutional support for sustainable Digital Humanities research software infrastructure and training.
The first manuscript we worked with is a handwritten German text (BL) of some 100,000 words, to which we added a transcript, notes, facsimiles, and a translation. We used the TEI to encode people, places, bibliographical references, and fictional characters. This was published online using TEI Publisher Web Components and required the team (DR, RT) to create a virtual machine, build a Django interface, provision a IIIF server, provision storage for images, and maintain the site over time, all of which incur significant technical debt and require specialised skills.
The second manuscript (NT) (https://rebrand.ly/LayardEfate) comprises 8,000 words of ethnographic notes from Vanuatu in 1914. An HTML version built by NT presents the text and images of the manuscript originals, sometimes up to eleven different page images corresponding to the same text, with decisions required to arrive at a consensus document. It is housed on a site controlled by NT, and is picked up by the Internet Archive. It requires no maintenance and has no dependencies, and NT was able to build the site himself.
How can we scope projects, understand the workload implications, and manage the expectations of academics who become excited after seeing completed TEI projects and want to apply the technology in their work? What kind of ongoing support is required to keep a site like this going? While some institutions have TEI support services that can guarantee ongoing access to encoded texts, what is the best strategy for an academic who has no access to such services?
-
Forager Folklore Database
Encoding Previously Transcribed Oral Literature: The Forager Folklore Database (FFDB)
The Forager Folklore Database (FFDB) assembles a large corpus of hunter-gatherer oral literature in English translation. It is designed as an empirical basis for the systematic study of narrative universals and the evolutionary origin of storytelling, as well as for (comparative) folklore studies.
The folktales in question were originally recorded by anthropologists and others (clergy, explorers, etc.) during the past 150 years. Besides generating rich metadata for each narrative, indicating its provenance as well as selected textual features, the FFDB will provide a corpus of narratives as digital texts encoded in TEI XML. The encoding preserves the spelling and page beginnings of the original source, thus making the often hard-to-access material available in a citable format. Furthermore, semantic annotation is set up for animals and plants (semi-automatically through WordNet synsets), colors, person and place names, as well as for narrative categories such as ana- and epimythia. This will offer researchers a more fine-grained method of text retrieval, e.g., finding all texts containing birds, and allow for richer computerized approaches in the growing field of computational folktale and narrative studies.
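The retrieval scenario mentioned above (finding all texts containing birds) can be illustrated with a minimal sketch. It assumes, purely for illustration, that each annotated term is wrapped in a TEI `<rs>` element whose `type`, `subtype`, and `key` attributes carry the category, a broader class, and a WordNet synset name. The FFDB's actual markup is not described here, so these attribute choices, the sample tales, and the function name `texts_containing` are all hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Two tiny invented narratives. The <rs> attributes are an assumed
# annotation scheme, not the FFDB's documented encoding.
SAMPLES = {
    "tale_001": """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
        <p>The <rs type="animal" subtype="bird" key="raven.n.01">raven</rs> stole the fire.</p>
    </body></text></TEI>""",
    "tale_002": """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
        <p>The <rs type="animal" subtype="mammal" key="jackal.n.01">jackal</rs> hid in the
        <rs type="plant" subtype="tree" key="acacia.n.01">acacia</rs>.</p>
    </body></text></TEI>""",
}


def texts_containing(samples, rs_type, subtype):
    """Return the sorted ids of texts with at least one matching <rs> annotation."""
    hits = []
    for text_id, xml_source in samples.items():
        root = ET.fromstring(xml_source)
        for rs in root.iterfind(".//tei:rs", TEI_NS):
            if rs.get("type") == rs_type and rs.get("subtype") == subtype:
                hits.append(text_id)
                break  # one match is enough for this text
    return sorted(hits)


bird_texts = texts_containing(SAMPLES, "animal", "bird")
print(bird_texts)
```

In a scheme that stores only the synset, the broader class ("bird") would have to be resolved through WordNet's hypernym hierarchy instead of being read from an attribute.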
The form of previously transcribed orality that the folktales take comes with its own set of challenges. Though they are transmitted to us in writing, many of the relevant features of the narratives derive from their origin in oral storytelling. Conversational framings, explanatory comments, personal asides, remarks and questions from the audience, the use of gestures, onomatopoeia, and songs go beyond the generally more bookish approaches of the TEI. They require new tools, new tags, and a new understanding of textuality and authorship.
The general project infrastructure is set up dynamically in the form of a work-in-progress relational database. We use Python to generate the TEI header automatically from the data stored in the database, before combining the headers with the pre-annotated texts. Unless more restrictive copyright prohibits this, the files will be published under a CC BY-NC-SA license. The metadata for the encoded narratives are enriched further by the motif assignments in Stith Thompson’s Motif-Index of Folk-Literature (1950-58), together with additional motif assignments, e.g., by Johannes Wilbert and Karin Simoneau (Folk Literature of the South American Indians, 1970-1992) and Sigrid Schmidt (Catalogue of the Khoisan Folktales of Southern Africa, 2013). A small selection of roughly 70 narratives has been encoded already to showcase the direction and possible scope of the FFDB. -
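The header-generation step described for the FFDB (Python reading metadata from a relational database, building a TEI header, and combining it with a pre-annotated text) might look roughly like the following sketch. The table layout, column names, sample record, and the helper names `build_header` and `combine` are assumptions made for illustration. The project's real schema and scripts are not shown in this abstract.

```python
import sqlite3
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI)  # serialise TEI as the default namespace


def q(tag):
    """Qualify a tag name with the TEI namespace."""
    return f"{{{TEI}}}{tag}"


def build_header(title, source, licence):
    """Build a minimal teiHeader element from one metadata record."""
    header = ET.Element(q("teiHeader"))
    file_desc = ET.SubElement(header, q("fileDesc"))
    title_stmt = ET.SubElement(file_desc, q("titleStmt"))
    ET.SubElement(title_stmt, q("title")).text = title
    pub_stmt = ET.SubElement(file_desc, q("publicationStmt"))
    ET.SubElement(pub_stmt, q("authority")).text = "FFDB"
    availability = ET.SubElement(pub_stmt, q("availability"))
    ET.SubElement(availability, q("licence")).text = licence
    source_desc = ET.SubElement(file_desc, q("sourceDesc"))
    ET.SubElement(source_desc, q("bibl")).text = source
    return header


def combine(header, text_xml):
    """Wrap a generated header and a pre-annotated <text> in a TEI root."""
    tei = ET.Element(q("TEI"))
    tei.append(header)
    tei.append(ET.fromstring(text_xml))
    return ET.tostring(tei, encoding="unicode")


# Hypothetical in-memory database standing in for the project database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE narrative (id TEXT PRIMARY KEY, title TEXT, source TEXT, licence TEXT)"
)
conn.execute(
    "INSERT INTO narrative VALUES (?, ?, ?, ?)",
    ("tale_001", "Example tale (hypothetical)", "Hypothetical source volume", "CC BY-NC-SA"),
)
row = conn.execute(
    "SELECT title, source, licence FROM narrative WHERE id = ?", ("tale_001",)
).fetchone()

annotated_text = (
    f'<text xmlns="{TEI}"><body><p>Pre-annotated narrative text.</p></body></text>'
)
document = combine(build_header(*row), annotated_text)
print(document)
```

Keeping the header in the database and regenerating it on demand means a corrected metadata record only has to be changed in one place before the files are rebuilt.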
Applied Text as Graph (ATAG)
"Applied Text as Graph" is a structured way to handle and analyze text by turning it into an interconnected graph. In this approach, every individual character in a block of text is represented by a "character node" (orange in the next figure). These nodes are connected in sequence, highlighting the linear arrangement of the text. Each block of text, whether it's a word or a paragraph, begins with a "text node" (blue) that serves as a starting point or a root element. These blocks of text can be linked to one another to suggest a reading sequence. The framework is highly detailed, focusing on the granular level of individual characters. To make referencing specific characters easier, each one is assigned a unique code, known as a UUID.
But the framework isn't just about individual characters; it also allows annotations to be added through "annotation nodes" (green). These annotations can provide context or explanations. Each annotation node is connected to the specific characters it belongs to. Furthermore, annotations can be linked to text nodes, offering a way to add further information about the annotation.
So, think of "Applied Text as Graph" as a dynamic, interactive map for text. It not only helps you navigate the content but also enables in-depth analysis and layered annotations.
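The node types described above can be mimicked with plain Python objects. The following is a minimal in-memory sketch, not the ATAG implementation: the class names `Node` and `TextGraph`, the relation labels (`STARTS_WITH`, `NEXT_CHARACTER`, `ANNOTATES`, `HAS_NOTE`), and the sample text are assumptions chosen for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str            # "text", "character" or "annotation"
    value: str = ""      # the character itself, or an annotation label
    node_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class TextGraph:
    def __init__(self):
        self.nodes = {}   # node_id -> Node
        self.edges = []   # (source_id, relation, target_id)

    def _add(self, kind, value=""):
        node = Node(kind, value)
        self.nodes[node.node_id] = node
        return node

    def add_text(self, content):
        """Create a text node followed by a chain of character nodes."""
        root = self._add("text")
        previous, relation = root, "STARTS_WITH"
        characters = []
        for ch in content:
            node = self._add("character", ch)
            self.edges.append((previous.node_id, relation, node.node_id))
            previous, relation = node, "NEXT_CHARACTER"
            characters.append(node)
        return root, characters

    def annotate(self, label, characters):
        """Attach an annotation node to each character it covers."""
        annotation = self._add("annotation", label)
        for ch in characters:
            self.edges.append((annotation.node_id, "ANNOTATES", ch.node_id))
        return annotation

    def annotated_string(self, annotation):
        """Reassemble the characters an annotation points to, in insertion order."""
        return "".join(
            self.nodes[target].value
            for source, relation, target in self.edges
            if source == annotation.node_id and relation == "ANNOTATES"
        )


graph = TextGraph()
root, chars = graph.add_text("Dear Anna")
person = graph.annotate("person", chars[5:9])

# An annotation can itself point to a further text node holding a note.
note_root, _ = graph.add_text("Recipient of the letter")
graph.edges.append((person.node_id, "HAS_NOTE", note_root.node_id))

print(graph.annotated_string(person))  # Anna
```

Because every character has its own identifier, overlapping annotations pose no structural problem in this model: two annotation nodes can simply point to intersecting sets of character nodes.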
https://git.thm.de/aksz15/atag