Automated Alignment of Medieval Text Versions based on Word Embeddings

Authors: Meinecke, Christofer
Wrisley, David Joseph
Jänicke, Stefan
Keywords: Sentence Alignment; Word Embedding; Visualization; Digital Humanities
Issue Date: 20-Mar-2020
Publisher: OSF Preprints
Citation: Meinecke, C., Wrisley, D. J., & Jänicke, S. (2020, March 20). Automated Alignment of Medieval Text Versions based on Word Embeddings.
Abstract: Medieval textuality is characterized by instability in text structure and length that varies according to the text tradition. This instability in the versions, otherwise known as “mouvance”, is characterized by dialectal difference, traces of orality, the modification of wording and even the rewriting and rearrangement of large parts of the text. To help humanities scholars in the exploratory analysis of such complex text collections, the visual analytic system iteal was initially proposed. The system aligns similar phrases at the line level on the basis of string similarity and word n-grams. We propose an extension of this system that replaces the parameter-based approach with an automatic one using word embeddings, thereby adding a semantic component. The benefit of the new visualization system is shown through a comparison of different versions of medieval French texts. Additionally, a domain expert compared the parameter-based approach with the approach based on word embeddings to outline the similarities and differences in the alignments.
DOI: 10.31219/
Appears in Collections: David Wrisley's Collection

Files in This Item:
File: LEVIA19_paper_6.pdf
Description: LEVIA 19 paper
Size: 3.2 MB
Format: Adobe PDF

Items in FDA are protected by copyright, with all rights reserved, unless otherwise indicated.