Platform for case-study based research and benchmarking

To what extent is the Information, Library and Archival Science community taking Machine Learning (ML) at face value? ML methods and tools for the classification, clustering and indexing of documents have demonstrated tangible results within cultural heritage institutions. As traditional indexing and cataloging practices are increasingly pushed into a corner and reduced to the luxury domain of “boutique metadata”, the rise of ML techniques to meet operational challenges has been welcomed. However, how do we as a research community assess the quality of the outcomes of both supervised and unsupervised ML?

This website offers a platform for sharing case-study based research within the Information, Library and Archival Science community. As underlined by, for example, André Vellino (2016) and Anne J. Gilliland (2017), there is an increasing need to work with “dirty” real-life document sets instead of the sterile, cleaned and marked-up test sets traditionally used for benchmarking purposes. Over the next few months, this website will aggregate a varied set of case studies, for which the research paper, the methods and tools, and the source data will be made available as open data.
