Cross-modality sub-image retrieval using contrastive multimodal image representations

Breznik, Eva; Wetzer, Elisabeth; Lindblad, Joakim; Sladoje, Nataša

dc.contributor.author	Breznik, Eva
dc.contributor.author	Wetzer, Elisabeth
dc.contributor.author	Lindblad, Joakim
dc.contributor.author	Sladoje, Nataša
dc.date.accessioned	2024-11-11T09:42:11Z
dc.date.available	2024-11-11T09:42:11Z
dc.date.issued	2024-08-13
dc.description.abstract	In tissue characterization and cancer diagnostics, multimodal imaging has emerged as a powerful technique. Thanks to computational advances, large datasets can be exploited to discover patterns in pathologies and improve diagnosis. However, this requires efficient and scalable image retrieval methods. Cross-modality image retrieval is particularly challenging, since images of similar (or even the same) content captured by different modalities might share few common structures. We propose a new application-independent content-based image retrieval (CBIR) system for reverse (sub-)image search across modalities, which combines deep learning to generate representations (embedding the different modalities in a common space) with robust feature extraction and bag-of-words models for efficient and reliable retrieval. We illustrate its advantages through a replacement study, exploring a number of feature extractors and learned representations, as well as through comparison to recent (cross-modality) CBIR methods. For the task of (sub-)image retrieval on a (publicly available) dataset of brightfield and second harmonic generation microscopy images, the results show that our approach is superior to all tested alternatives. We discuss the shortcomings of the compared methods and observe the importance of equivariance and invariance properties of the learned representations and feature extractors in the CBIR pipeline. Code is available at: https://github.com/MIDA-group/CrossModal_ImgRetrieval.	en_US
dc.identifier.citation	Breznik, Wetzer, Lindblad, Sladoje. Cross-modality sub-image retrieval using contrastive multimodal image representations. Scientific Reports. 2024;14(1)	en_US
dc.identifier.cristinID	FRIDAID 2291677
dc.identifier.doi	10.1038/s41598-024-68800-1
dc.identifier.issn	2045-2322
dc.identifier.uri	https://hdl.handle.net/10037/35621
dc.language.iso	eng	en_US
dc.publisher	Springer Nature	en_US
dc.relation.journal	Scientific Reports
dc.relation.projectID	Norges forskningsråd: 309439	en_US
dc.rights.accessRights	openAccess	en_US
dc.rights.holder	Copyright 2024 The Author(s)	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/4.0	en_US
dc.rights	Attribution 4.0 International (CC BY 4.0)	en_US
dc.title	Cross-modality sub-image retrieval using contrastive multimodal image representations	en_US
dc.type.version	publishedVersion	en_US
dc.type	Journal article	en_US
dc.type	Tidsskriftartikkel	en_US
dc.type	Peer reviewed	en_US

File(s) in this item

Name: article.pdf
Size: 2.526Mb
Format: PDF

This item appears in the following collection(s)

Articles, reports and other (physics and technology) [1062]

Except where otherwise noted, this item's license is described as Attribution 4.0 International (CC BY 4.0)