Advances in Digital Music Iconography

Mike Kestemont

Benchmarking the detection of musical instruments in unrestricted, non-photorealistic images from the artistic domain

In this talk, we present Minerva, the first benchmark dataset for the detection of musical instruments in non-photorealistic, unrestricted image collections from the realm of the visual arts. This effort is situated against the scholarly background of music iconography, an interdisciplinary field at the intersection of musicology and art history. We benchmark a number of state-of-the-art systems for image classification and object detection. Our results demonstrate the feasibility of the task but also highlight the significant challenges that this artistic material poses to computer vision. We then apply the detection system to an out-of-sample collection and offer a quantitative evaluation of the false positives it detects. The error analysis yields a number of unexpected insights into the contextual cues that trigger the detector. For instance, the iconography surrounding children shares some core properties with that of musical instruments, such as an intimacy in body language.

Authors: Mike Kestemont (presenting), Matthia Sabatelli, Nicolae Banari, Marie Cocriamont, Eva Coudyzer, Karine Lasaracina, Walter Daelemans, Pierre Geurts

Link to the video:


Zoom meeting (after the video transmission):