Where are the pictures? Linking photographic records across collections using fuzzy logic
Stephen Brown, UK, Simon Coupland, UK, David Croft, UK, Jethro Shell, UK, Alexander von Lünen, UK
This paper describes a novel approach to interrogating different online collections to identify potential matches between them, using fuzzy logic-based data-mining algorithms. Where collections overlap, information about objects in one could be used to enrich records in another. However, although the Web holds a considerable amount of bibliographic and other data describing similar items, there is not yet a standardized way of structuring such data that makes significant relationships easy to identify. In the case of historical photographs, the challenge is exacerbated by the enormous range of subjects depicted, by surviving records that are not always complete, accurate, or consistent, and by the very small amount of text available per record. Fuzzy-matching algorithms combined with semantic similarity techniques offer a way of finding potential matches between such items when standard ontology and corpus-based approaches are inadequate, in this case helping researchers, for the first time, to match photographs held in different archives to historical exhibition catalogue records.