Automated Alignment of Medieval Text Versions based on Word Embeddings
Wrisley, David Joseph
Sentence Alignment; Word Embedding; Visualization; Digital Humanities
Meinecke, C., Wrisley, D. J., & Jänicke, S. (2020, March 20). Automated Alignment of Medieval Text Versions based on Word Embeddings. https://doi.org/10.31219/osf.io/tah3y
Medieval textuality is characterized by instability in text structure and length that varies according to the text tradition. This instability across versions, otherwise known as “mouvance”, manifests as dialectal difference, traces of orality, the modification of wording, and even the rewriting and rearrangement of large parts of the text. To help humanities scholars in the exploratory analysis of such complex text collections, the visual analytic system iteal was initially proposed. The system aligns similar phrases at the line level on the basis of string similarity and word n-grams. We propose an extension of this system that replaces the parameter-based approach with an automatic one using word embeddings, thereby adding a semantic component. The benefit of the new visualization system is shown through a comparison of different versions of medieval French texts. Additionally, a domain expert compared the parameter-based approach with the approach based on word embeddings to outline the similarities and differences in the alignments.
Appears in Collections:
David Wrisley's Collection
Files in This Item:
LEVIA 19 paper