19th March 2021

On the Style of the Novel. Algorithmic Identification of the Types of the Compound and Complex Sentences.

Botond Szemes (Eötvös Loránd University)

In my presentation, I show a method that automatically identifies the different types of complex and compound sentences in Hungarian by analyzing conjunctions and their positions. This method opens new perspectives in stylometric research: on the one hand, conjunctions, as function words, provide a large amount of data for statistical analyses; on the other hand, they also carry meaning, since they express the relationships between clauses (e.g. opposition, conditionality). By examining the relative frequency of each type, it becomes possible to reveal the most characteristic clause relationships in a given text or corpus. Thus, the style and poetics of the novels can be grasped at the scale of the sentence, while a topological-logical structure of the texts can also emerge, one that usually goes unnoticed in the reading process.
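
As a rough illustration of the idea of relative frequencies of conjunction types, here is a minimal Python sketch. It is not the speaker's method: the tiny lexicon `CONJUNCTION_TYPES`, its type labels, and the function name `conjunction_type_frequencies` are assumptions made up for this example. It also ignores the position of the conjunction, which the method described above takes into account, and plain token matching will miscount words that are only sometimes conjunctions.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny lexicon mapping a few common Hungarian
# conjunctions to a relationship type. A real study would need a much
# larger, linguistically validated list.
CONJUNCTION_TYPES = {
    "és": "copulative",    # "and"
    "de": "adversative",   # "but"
    "vagy": "disjunctive", # "or"
    "mert": "causal",      # "because"
    "ha": "conditional",   # "if"
}


def conjunction_type_frequencies(text):
    """Return the share of each conjunction type among all matched conjunctions."""
    # Lowercase and split into word tokens; \w matches accented letters in Python 3.
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(
        CONJUNCTION_TYPES[token] for token in tokens if token in CONJUNCTION_TYPES
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ctype: n / total for ctype, n in counts.items()}


sample = "Elment, de visszajött, mert esett, és ha esik, otthon marad."
print(conjunction_type_frequencies(sample))
```

Comparing such proportions across texts would be one simple way to see which relationship type dominates in a given novel.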

In the presentation, I analyze the frequency of the conjunction types in a corpus of 100 Hungarian novels. This provides an opportunity (1) to examine which texts are the most characteristic of each type in comparison to the others; (2) to visualize the ‘internal structure’ of a text, which is based on the proportions of the types relative to each other; and (3) to register historical changes. The latter may be particularly interesting, since clear tendencies can be observed in how the frequency of the types changes over the period covered by the 100 novels (1832-2005).


The meeting will take place live on Zoom at 1 pm. To participate, please fill in the survey: https://forms.gle/4K1MJ7V9JW8MDKmq7. The link to the meeting will be sent to the email address provided in the form.

The first part of the meeting (the lecture) will be recorded and later uploaded to our YouTube channel. Although we will only be recording the slides and the speaker’s audio, we kindly ask those of you who do not want to risk accidental sharing of your personal image to turn off your cameras and turn them back on in the second part of the meeting, a discussion, which will not be recorded.