Authors: F. Murabito, C. Spampinato, S. Palazzo, D. Giordano, K. Pogorelov and M. Riegler
Title: Top-Down Saliency Detection Driven by Visual Classification
Affiliation: Communication Systems
Project(s): Efficient EONS: Execution of Large Workloads on Elastic Heterogeneous Resources, Department of Holistic Systems
Publication Type: Journal Article
Year of Publication: 2018
Journal: Computer Vision and Image Understanding

This paper presents an approach for saliency detection able to emulate the integration of the top-down (task-controlled) and bottom-up (sensory information) processes involved in human visual attention. In particular, we first learn how to generate saliency when a specific visual task has to be accomplished. Afterwards, we investigate if and to what extent the learned saliency maps can support visual classification in nontrivial cases. To achieve this, we propose SalClassNet, a CNN framework consisting of two networks jointly trained: a) the first one computing top-down saliency maps from input images, and b) the second one exploiting the computed saliency maps for visual classification. To test our approach, we collected a dataset of eye-gaze maps, using a Tobii T60 eye tracker, by asking several subjects to look at images from the Stanford Dogs dataset, with the objective of distinguishing dog breeds. Performance analysis on our dataset and other saliency benchmarking datasets, such as POET, showed that SalClassNet outperforms state-of-the-art saliency detectors, such as SalNet and SALICON. Finally, we also analyzed the performance of SalClassNet in a fine-grained recognition task and found out that it yields enhanced classification accuracy compared to Inception and VGG-19 classifiers. The achieved results, thus, demonstrate that 1) conditioning saliency detectors with object classes reaches state-of-the-art performance, and 2) explicitly providing top-down saliency maps to visual classifiers enhances accuracy.

Citation Key: 25860