Probabilistic Topic Model for Context-Driven Visual Attention Understanding

Modern computer vision techniques have to dealwith vast amounts of visual data, which implies a computationaleffort that has often to be accomplished in broad and challenging scenarios. The interest in efficiently solving these imageand video applications has led researchers to develop methodsto expertly drive the corresponding processing to conspicuousregions that either depend on the context or are based on specificrequirements. In this paper, we propose a general hierarchicalprobabilistic framework, independent of the application scenario,and relied on the most outstanding psychological studies aboutattention and eye movements which support that guidance isnot based directly on the information provided by early visualprocesses but on a contextual representation that arose fromthem. The approach defines the task of context-driven visualattention as a mixture of latent sub-tasks, which are, in turn,modeled as a combination of specific distributions associatedto low-, mid-, and high-level spatio-temporal features. Learningfrom fixations gathered from human observers, we incorporatean intermediate level between feature extraction and visualattention estimation that enables to obtain comprehensivelyguiding representations. The experiments show how our proposalsuccessfully learns particularly adapted hierarchical explanationsof visual attention in diverse video genres, outperforming severalleading models in the literature.

Probabilistic Topic Model for Context-Driven Visual Attention Understanding Articles