Our project studies the English metalanguage that was created to analyse and compare, appraise and classify, teach and learn the vernacular languages of Europe between 1500 and 1700, i.e. before the development of comparative philology and the institutionalisation of linguistics as an academic discipline. To this end, we will build a corpus of texts dedicated to, or including observations on, vernacular languages; in the period under review, such texts are found in works with a wide variety of aims and fields. Through archival research and corpus compilation, the project aims to assess the genres and text-types involved in the circulation of linguistic knowledge, and thus to shed light on unconventional texts and voices besides the major works and figures on which scholarship has naturally concentrated. The core part of our study will be the analysis, from a diachronic perspective, of the terminology, discursive strategies and descriptive metaphors used to discuss language in these texts.
Our method for corpus collection combines manual and computational tools to analyse available sources and make an inventory of authors and works representative of early modern linguistic metalanguage in English. For the purposes of this project, we intend to collect a meaningful corpus, in the sense that it both corroborates existing scholarly knowledge about some major aspects of the evolution of linguistics and its metalanguage in English and provides new insights into facets of this evolution that have not previously been observed. For these reasons, we aim for an open corpus, the composition of which may change over time, also benefitting from future external contributions. In terms of actual workflow, we will proceed as follows: the large amount of scraped information will be cleaned, simplified, and tokenised with the NLTK library for Python; subsequently, the keywords and collocations will be further consolidated, analysed and processed through lexicon extraction techniques. The corpus will be published open access, so that it can be freely queried by other researchers.
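The workflow described above (cleaning, tokenisation, then keyword and collocation counts) can be illustrated with a minimal sketch. It is not the project's pipeline: it uses only the Python standard library as a dependency-free stand-in for the NLTK steps, and the seed vocabulary, function names and sample sentence are all hypothetical, invented for illustration.

```python
import re
from collections import Counter

# Hypothetical seed vocabulary, for illustration only; the project
# would derive its own term list from the sources.
SEED_TERMS = {"tongue", "language", "grammar", "speech", "dialect"}


def clean(text):
    """Lower-case the text and replace everything except ASCII letters,
    apostrophes and whitespace with spaces.

    Assumption: plain ASCII input. Early modern print conventions
    (long s, ligatures, abbreviations) would need normalising first.
    """
    return re.sub(r"[^a-z'\s]", " ", text.lower())


def tokenise(text):
    """Split the cleaned text on whitespace."""
    return clean(text).split()


def keyword_counts(tokens, seeds=SEED_TERMS):
    """Count how often each seed term occurs."""
    return Counter(tok for tok in tokens if tok in seeds)


def collocations(tokens, seeds=SEED_TERMS, window=2):
    """Count (seed, neighbour) pairs within `window` tokens of a seed term."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok not in seeds:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                counts[(tok, tokens[j])] += 1
    return counts


# Invented sample sentence, not a quotation from any source text.
sample = ("The English tongue is copious; the French tongue is smooth, "
          "and the Italian tongue is sweet.")
tokens = tokenise(sample)
print(keyword_counts(tokens).most_common(3))
print(collocations(tokens, window=1).most_common(3))
```

Raw co-occurrence counts of this kind are only a starting point; a fuller treatment would filter function words and rank pairs with an association measure before any lexicon extraction.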
The uses of such a corpus will be manifold. It will help raise awareness of the significance of linguistics and philology in multilingual Europe, and underline the importance of these studies for advancing our knowledge of a long tradition of contact, exchange and even conflict between the linguistic and cultural identities of Europe. It will be a scholarly and didactic tool, and the terminology extracted from it will provide data of interest for open-source dictionaries and lexical repertoires. This study is timely and relevant as a contribution to the existing debate on the development of the discourse of the humanities as an inherently interdisciplinary field.