19th March 2021

On the Style of the Novel. Algorithmic Identification of the Types of the Compound and Complex Sentences.

Botond Szemes (Eötvös Loránd University)

In my presentation, I show a method for automatically identifying the different types of complex and compound sentences in Hungarian by analyzing conjunctions and their positions. This method opens new perspectives in stylometric research: on the one hand, conjunctions, as function words, provide a large amount of data for statistical analysis; on the other hand, they also carry meaning, since they express the relationships between clauses (e.g. opposition, conditionality). Examining the relative frequency of each type makes it possible to reveal the most characteristic clause relationships in a given text or corpus. Thus, the style and poetics of novels can be grasped at the scale of the sentence, while a topological-logical structure of the texts can also emerge, one that usually goes unnoticed in the reading process.
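To make the idea of relative type frequencies concrete, here is a minimal Python sketch. It is not the method of the presentation: the mapping from conjunctions to relation types is a small illustrative assumption, the function name type_frequencies is hypothetical, and the sketch ignores the position of the conjunction, which the actual method takes into account (and which would be needed to handle ambiguous words).

```python
import re
from collections import Counter

# Illustrative mapping only (an assumption, not the inventory used in the talk).
CONJUNCTION_TYPES = {
    "és": "copulative",     # and
    "de": "adversative",    # but
    "vagy": "disjunctive",  # or
    "ha": "conditional",    # if
    "mert": "causal",       # because
    "tehát": "conclusive",  # therefore
}

def type_frequencies(text):
    """Return the relative frequency of each relation type among all
    conjunctions found in the text (position is ignored in this sketch)."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(CONJUNCTION_TYPES[t] for t in tokens if t in CONJUNCTION_TYPES)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {rel_type: n / total for rel_type, n in counts.items()}

# Toy input, invented for illustration.
sample = "Eljött, de nem maradt, mert fáradt volt. Ha esik, otthon maradok, vagy olvasok."
freqs = type_frequencies(sample)
for rel_type, share in sorted(freqs.items()):
    print(rel_type, round(share, 2))
```

Normalizing by the total number of conjunctions, rather than by text length, yields the proportions of the types relative to each other; normalizing by sentence or token count would be an equally plausible choice.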

In the presentation, I analyze the frequency of the conjunction types in a corpus of 100 Hungarian novels. This provides an opportunity (1) to examine which texts are the most characteristic of each type in comparison with the others; (2) to visualize the ‘internal structure’ of a text, which is based on the proportions of the types relative to each other; and (3) to register historical changes. The last of these may be particularly interesting, since clear tendencies can be observed in how the frequency of the types changes over the period covered by the 100 novels (1832-2005).
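Points (1) and (3) can be sketched in Python as follows. This is only an illustration under stated assumptions: the three corpus entries are invented placeholders, not data from the 100 novels, the function names are hypothetical, and a least-squares slope is just one simple way to express a tendency over time.

```python
# Each entry: (title, publication year, relative frequency per type).
# Placeholder values invented for illustration; not results from the corpus.
corpus = [
    ("novel_a", 1832, {"adversative": 0.20, "conditional": 0.10}),
    ("novel_b", 1900, {"adversative": 0.30, "conditional": 0.15}),
    ("novel_c", 2005, {"adversative": 0.25, "conditional": 0.30}),
]

def most_characteristic(corpus, rel_type):
    """Title of the text with the highest relative frequency of rel_type."""
    title, _, _ = max(corpus, key=lambda entry: entry[2].get(rel_type, 0.0))
    return title

def trend_slope(corpus, rel_type):
    """Least-squares slope of relative frequency against publication year."""
    years = [year for _, year, _ in corpus]
    values = [freqs.get(rel_type, 0.0) for _, _, freqs in corpus]
    mean_x = sum(years) / len(years)
    mean_y = sum(values) / len(values)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    var = sum((x - mean_x) ** 2 for x in years)
    return cov / var

print(most_characteristic(corpus, "adversative"))
print(trend_slope(corpus, "conditional"))
```

A positive slope would indicate that a type becomes more frequent over the period, a negative one that it declines.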


