Recreating the Network of Early Modern Natural Philosophy: A Mono- and Multilingual Text Data Vectorization Method
Blog Article
How could one create a network representation of a book corpus that spans more than two hundred years? In this paper, we present a method based on text data vectorization for a complex and multifaceted network representation of an early modern corpus of 239 natural philosophy textbooks published in Latin, French, and English. We use unsupervised methods (namely, topic modeling, term frequency–inverse document frequency, and multilingual word embeddings) to represent the broader features of this corpus, such as its homogeneity in style and linguistic usage, both among works written in the same language and across multiple languages. We call this the ‘textual dimension.’ We also use a collocate analysis of specific keywords to explore how certain concepts were understood, reshaped, and disseminated in the corpus. We call this the ‘semantic dimension.’ Each of these two dimensions provides a different way of correlating the books via text data vectorization and of representing them as a network. Since these dimensions are complex and multifaceted, the network we construct for each of them is a multiplex made of several layer-graphs. Furthermore, using existing bio-bibliographical information, this research provides the grounds for expanding the described network representation into a third multiplex, one that explores some of the social features of the authors in question.
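To make the idea of correlating books through vectorization more concrete, here is a minimal sketch of how one layer-graph might be derived from term frequency–inverse document frequency vectors. It is an illustration under stated assumptions, not the pipeline used in this research: the toy token lists, the function names (`tfidf_vectors`, `cosine`, `similarity_layer`), the cosine similarity measure, and the `threshold` value are all hypothetical choices made for the example.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: the number of documents in which each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_layer(books, threshold=0.1):
    """Build one layer-graph as a list of weighted edges (book_a, book_b, similarity).

    `threshold` is an assumed cut-off: pairs less similar than this get no edge.
    """
    titles = list(books)
    vectors = dict(zip(titles, tfidf_vectors([books[t] for t in titles])))
    edges = []
    for a, b in combinations(titles, 2):
        weight = cosine(vectors[a], vectors[b])
        if weight >= threshold:
            edges.append((a, b, weight))
    return edges


# Hypothetical toy corpus: three "books" reduced to a handful of Latin tokens.
books = {
    "book_a": "corpus motus corpus locus".split(),
    "book_b": "motus corpus tempus".split(),
    "book_c": "anima intellectus voluntas".split(),
}

# A multiplex can be held as a mapping from layer name to that layer's edge list;
# further layers (e.g. topic-model or embedding similarity) would be added alongside.
multiplex = {"tfidf": similarity_layer(books)}
print(multiplex["tfidf"])
```

In this toy corpus only `book_a` and `book_b` share vocabulary, so the layer contains a single edge between them, while `book_c` remains isolated. A real corpus would need tokenization, lemmatization, and a similarity threshold chosen with the historical material in mind.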