This two-day symposium aims to bring together scholars and researchers working with computational approaches to texts. The event targets a broad audience interested in the application of digital text analysis technologies such as text mining, topic modeling, authorship detection, writing style analysis, text reuse detection, or, more generally, tasks performed through Natural Language Processing (NLP). These techniques have significant potential not only for the study of literature but also for the study of texts and language in general. The symposium aims to create an open forum for showcasing these techniques.

The event is also grounded in the idea that computational text analysis should be integrated not only into the academic research of faculty and their PhD students, but also into teaching. The use of computational analysis opens up new questions in literary studies and exposes students to many different ways of thinking about literature today.

Computer-aided literary studies still tend to focus on literatures written in modern languages. NLP tools are well developed for modern languages, especially modern English. For medieval and premodern languages, whose orthographic forms are unstable, computer-aided (and thus, to a degree, systematic) research faces the challenge of normalizing and standardizing linguistic forms. The symposium therefore also aims to explore the use, and the challenges, of NLP tools for studying literatures written in underrepresented and historical languages, such as the medieval and premodern variants and precursors of Spanish, French, Latin, and Dutch. A special focus will be on the preprocessing routines available for these texts, such as lemmatization, which groups inflected forms under a single item or lemma, and on the challenges of normalizing orthographic variation in historical texts and in other languages with unstable orthographies. The international and national speakers will include several experts on these topics.
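
As a purely illustrative aside, the two preprocessing steps named above (orthographic normalization followed by lemmatization) can be sketched in a few lines of Python. Everything in this sketch is an assumption made for the example: the spelling rules, the toy lexicon, and the function names are hypothetical and do not describe any particular tool or any speaker's method. Real lemmatizers for historical languages typically rely on trained models or much larger lexicons.

```python
import re

# Hypothetical normalization rules for a Latin-like spelling system.
# Each (pattern, replacement) pair collapses one common spelling variant.
NORMALIZATION_RULES = [
    (re.compile(r"v"), "u"),  # treat u/v as a single letter
    (re.compile(r"j"), "i"),  # treat i/j as a single letter
]

# Toy lexicon (illustrative only): normalized inflected form -> lemma.
LEMMA_LEXICON = {
    "uinum": "uinum",
    "uini": "uinum",
    "uino": "uinum",
    "iustus": "iustus",
    "iusti": "iustus",
}


def normalize(token):
    """Lowercase a token and apply each spelling rule in order."""
    token = token.lower()
    for pattern, replacement in NORMALIZATION_RULES:
        token = pattern.sub(replacement, token)
    return token


def lemmatize(token):
    """Return the lemma for a token, or its normalized form if unknown."""
    form = normalize(token)
    return LEMMA_LEXICON.get(form, form)


def lemmatize_text(text):
    """Split text into word tokens and lemmatize each one."""
    return [lemmatize(token) for token in re.findall(r"\w+", text)]
```

In this sketch the variant spellings "Vini" and "uino" both resolve to the single lemma "uinum", while a word missing from the toy lexicon is returned in its normalized form only.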

Our envisioned program for the symposium is as follows. On the first day, there will be several workshops, including one devoted to integrating computer-assisted analysis in the classroom, which will offer an introduction to stylometry, visualization, and text reuse. On the second day, there will be talks (30 min) that present ongoing research projects, methodologies, and challenges. The subject languages are preferably, but not exclusively, underrepresented and historical languages.

We are specifically interested in receiving proposals for contributions on one or more of the following topics:

  • Stylometry for authorship studies
  • Stylometry as an approach to literary study
  • Natural Language Processing and linguistic annotation
  • Lemmatizers for underrepresented modern languages and old languages
  • Text reuse detection
  • Normalization
  • Distributional semantics
  • Network analysis
  • Text visualization

We especially welcome contributions from those working with any type of textual corpora, preferably corpora conceived for a specific research purpose and/or from a diachronic perspective. We conceive this symposium as an opportunity to share (best) practices and broaden the conversation, so proposals may address ongoing and experimental methodologies.

Abstract submissions and format

We invite researchers to submit 500-word proposals (including footnotes but excluding the bibliography), on a single page, related to any of the topics mentioned above. Contributions will take the form of 20-minute presentations followed by 10 minutes of Q&A. The title, author name(s), and affiliation should appear on the proposal, and the preferred formats are .txt, .docx, .odt, and .pdf.

Submissions must be sent to susanna_alles@miami.edu and will be reviewed by the scientific committee.

Languages

The official language of the symposium is English, but proposals may also be submitted in Spanish, French, or Italian.