Title: | Exploring Gulf Manumission Documents with Word Vectors |
Authors: | Kirmizialtin, Suphan; Wrisley, David Joseph |
Keywords: | Handwritten Text Recognition (HTR); word vector models (WVM); India Office Records (IOR); manumission; Gulf Studies; colonial archives; slavery |
Issue Date: | 27-Dec-2024 |
Publisher: | Brill |
Citation: | Kirmizialtin, S. and D.J. Wrisley. (2024). Exploring Gulf Manumission Documents with Word Vectors. Journal of Digital Islamicate Research 2: 1-29. |
Abstract: | In this article we analyze a corpus related to manumission and slavery in the Arabian Gulf in the late nineteenth and early twentieth centuries that we created using Handwritten Text Recognition (HTR). The corpus comes from India Office Records (IOR) R/15/1/199 File 5. Spanning the period from the 1890s to the early 1940s and comprising 977K words, it contains a variety of perspectives on manumission and slavery in the region, from manumission requests to administrative documents relevant to colonial approaches to the institution of slavery. We use word2vec with the wordVectors package in R to highlight how the method can uncover semantic relationships within historical texts, demonstrating exploratory semantic queries, investigation of word analogies, and vector operations using the corpus content. We argue that advances in applied computer vision such as HTR are promising for historians working in colonial archives and that, while our method is reproducible, there are still issues related to language representation and limitations of scale within smaller datasets. Even though HTR corpus creation is labor intensive, word vector analysis remains a powerful tool of computational analysis for corpora where HTR error is present. |
URI: | http://hdl.handle.net/2451/74850 |
DOI: | doi:10.1163/27732363-bja00005 |
Rights: | CC BY 4.0 Open Access |
Appears in Collections: | David Wrisley's Collection |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
27732363_002_01-02_s001_text.pdf | Gulf Manumission Documents | 1.18 MB | Adobe PDF |