Domain Adaptation and NBNN
In recent years, several authors have observed that state-of-the-art categorization methods fail to perform well when trained and tested on data from different databases. The general consensus in the literature is that this issue, known as domain adaptation and/or dataset bias, is due to a distribution mismatch between data collections.
But what if broadly adopted features and classifiers were part of the problem, rather than good ingredients for its solution?
Since the seminal work of Boiman et al., the Naive Bayes Nearest Neighbor (NBNN) method has challenged (i) the vector quantization step in Bag of Words (BOW) descriptors, which yields a compact feature representation at the expense of its informative content, and (ii) the computation of image-to-image distances, which enables the use of kernel-based classification methods but does not generalize much beyond the labelled images.
Even though the domain adaptation/dataset bias problem is clearly, at its core, a generalization problem, almost all approaches presented so far use image-to-image learning algorithms on top of BOW representations.
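To make the contrast with image-to-image methods concrete, the image-to-class decision rule behind NBNN can be sketched as follows. This is a minimal illustration and not the released MATLAB code: the function names, the toy 2-D "descriptors" and the class labels are assumptions made for the example, and a real system would use high-dimensional local descriptors and an approximate nearest-neighbor search.

```python
# Minimal NBNN sketch (illustrative only): an image is a set of local
# descriptors, and each class is the pool of all descriptors extracted
# from its training images. No quantization, no image-to-image distance.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def image_to_class_distance(descriptors, class_pool):
    """Sum over the image's descriptors of the squared distance to the
    nearest descriptor in the class pool."""
    return sum(min(sq_dist(d, p) for p in class_pool) for d in descriptors)

def nbnn_classify(descriptors, pools):
    """Return the label whose pool gives the smallest image-to-class distance."""
    return min(pools, key=lambda label: image_to_class_distance(descriptors, pools[label]))

# Hypothetical toy data: two classes, three descriptors per class.
pools = {
    "mug": [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1)],
    "laptop": [(1.0, 1.0), (0.9, 1.1), (1.2, 0.8)],
}
query = [(0.05, 0.1), (0.15, 0.15), (0.9, 0.9)]
predicted = nbnn_classify(query, pools)
```

In this toy case two of the three query descriptors lie close to the "mug" pool, so the summed distance favors that class even though one descriptor is closer to "laptop".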
Domain Adaptive NBNN
Our main intuition is that the distribution of local features per class may be similar across two domains despite the variation between the respective image distributions. This similarity can also be enhanced by a domain adaptation approach in the NBNN setting.
With this goal in mind, we start from the metric learning method proposed in [1] and extend it to deal with two domains. Inspired by [2], we propose a greedy algorithm that progressively selects an increasing number of target instances and combines them with a subset of the source data, while iteratively learning a Mahalanobis metric per class.
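The two ingredients named above, a per-class Mahalanobis metric and a greedy selection of target instances, can be illustrated with the rough sketch below. It is not the DA-NBNN implementation: the metric-learning update is omitted entirely (the metrics stay fixed at the identity), the nearest neighbor is searched under the class metric as a simplification, and the margin-based confidence score, the function names and the toy data are all assumptions introduced for the example.

```python
# Sketch only: image-to-class distance under one Mahalanobis metric per class,
# plus a single greedy selection step. The metric-learning update is NOT shown.

def mahalanobis_sq(a, b, M):
    """(a - b)^T M (a - b), with M given as a list of rows."""
    diff = [x - y for x, y in zip(a, b)]
    n = len(diff)
    return sum(diff[i] * M[i][j] * diff[j] for i in range(n) for j in range(n))

def image_to_class_distance(descriptors, class_pool, M):
    """Sum of nearest-neighbor distances to the class pool under metric M."""
    return sum(min(mahalanobis_sq(d, p, M) for p in class_pool) for d in descriptors)

def select_confident_targets(target_images, pools, metrics, k):
    """Pseudo-label each unlabeled target image and return the k images with
    the largest margin between the two smallest class distances.
    The margin is a hypothetical confidence score; needs at least two classes."""
    scored = []
    for name, descriptors in target_images.items():
        dists = sorted(
            (image_to_class_distance(descriptors, pools[c], metrics[c]), c)
            for c in pools
        )
        margin = dists[1][0] - dists[0][0]
        scored.append((margin, name, dists[0][1]))
    scored.sort(reverse=True)
    return [(name, label) for _, name, label in scored[:k]]

# Hypothetical toy data: identity metrics, source pools, three target images.
identity = [[1.0, 0.0], [0.0, 1.0]]
pools = {"mug": [(0.0, 0.0), (0.2, 0.1)], "laptop": [(1.0, 1.0), (0.9, 1.1)]}
metrics = {"mug": identity, "laptop": identity}
targets = {
    "t1": [(0.1, 0.0), (0.0, 0.1)],  # close to the "mug" pool
    "t2": [(0.5, 0.5), (0.6, 0.4)],  # ambiguous
    "t3": [(1.0, 0.9), (1.1, 1.0)],  # close to the "laptop" pool
}
selected = select_confident_targets(targets, pools, metrics, k=2)
```

In an iterative scheme of the kind described above, the selected target images would be added to the training pools with their pseudo-labels, the per-class metrics would be re-learned, and `k` would grow from one round to the next.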
[1] Z. Wang, Y. Hu, and L.-T. Chia. Image-to-class distance metric learning for image classification. In ECCV, 2010.
[2] L. Bruzzone and M. Marconcini. Domain adaptation problems: A DASVM classification technique and a circular validation strategy. IEEE PAMI, 32(5):770–787, 2010.
Source Code
The software for the DA-NBNN algorithm described in our ICCV paper can be downloaded below. The demo runs the experiment on a subset of the Office+Caltech dataset as in section 5.2 of the paper (and Figure 3) and produces results for:
- BOW-Nearest Neighbor
- NBNN
- DA-NBNN
All scripts are implemented in MATLAB. The code has been tested under Linux using MATLAB 8.1.0.604 (R2013a).
>> DANBNNdemo.tar.gz (72.1 Mb)
If you use this software, please cite the following paper:
@inproceedings{TommasiICCV2013,
author = {Tatiana Tommasi and Barbara Caputo},
title = {Frustratingly Easy NBNN Domain Adaptation},
booktitle = {ICCV},
year = {2013}}