Abstract: Ambiguous images are widely recognized as a valuable tool for probing human perception. Perceptual biases that arise when people make judgements about ambiguous images reveal their expectations about the environment. While perceptual biases in early visual processing have been well established, their existence in higher-level vision has been explored only for faces, which may be processed differently from other objects. Here we developed a new, highly versatile method of creating ambiguous hybrid images comprising two component objects belonging to distinct categories. We used these hybrids to measure perceptual biases in object classification and found that images of man-made (manufactured) objects dominated those of naturally occurring (non-man-made) ones in hybrids. This dominance generalized to a broad range of object categories, persisted when the horizontal and vertical elements that dominate man-made objects were removed, and increased with the real-world size of the manufactured object. Our findings show for the first time that people have perceptual biases to see man-made objects, and suggest that extended exposure to manufactured environments has changed the way that our urban-living participants see the world.
3. Discussion
We examined biases in people's classification of different types of natural images. In experiment 1, we found that when an ambiguous hybrid image was formed of structures from two different image categories, classification was biased towards the man-made categories (houses and vehicles) rather than towards the non-man-made categories (animals and flowers). This ‘man-made bias’ is not a bias towards any specific spatial frequency content. Additional experiments (see electronic supplementary material, §S5) revealed that the bias is (1) common across urban-living participants in different countries, and (2) not simply a response bias. Experiment 2 replicated and extended these results, demonstrating that the bias was affected by the real-world size of man-made objects (but not animal size), with a stronger bias for larger man-made objects. Reduced biases for small man-made objects may be explained by shared feature statistics (e.g. curvature) between small (but not large) man-made objects and both small and large animals [22]. However, we highlight that the bias is not restricted to larger man-made objects: we still obtained man-made biases when small man-made objects were paired with animals. We propose that this man-made bias is the result of expectations about the world that favour the rapid interpretation of complex images as man-made. Given that the visual diet of our urban participants is rich in man-made objects, our results are consistent with a Bayesian formulation of perceptual biases, whereby ambiguous stimuli are resolved in favour of frequently occurring attributes [5].
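To make this Bayesian account concrete, the following sketch (our own notation, intended only as an illustration of the formulation cited in [5]) shows why a hybrid with balanced image evidence should be classified according to the prior.

```latex
% Illustrative Bayesian-observer sketch; notation assumed, not taken from [5].
% For a hybrid image x and candidate categories c in {man-made, natural}:
\[
  p(c \mid x) \;\propto\; p(x \mid c)\, p(c)
\]
% If the hybrid is constructed so that the image evidence is balanced,
% i.e. p(x | man-made) is approximately p(x | natural), the posterior
% odds reduce to the prior odds:
\[
  \frac{p(\text{man-made} \mid x)}{p(\text{natural} \mid x)}
  \;\approx\;
  \frac{p(\text{man-made})}{p(\text{natural})} \;>\; 1,
\]
% for an observer whose visual diet over-represents man-made objects,
% predicting the observed classification bias.
```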
We stress that the man-made bias is not merely a manifestation of the relative insensitivity to tilted (i.e. neither vertical nor horizontal) contours, commonly known as the ‘oblique effect’ [23,24]. Our participants exhibited biases in favour of man-made objects even when cardinal orientations had been filtered out of the images. This occurred despite the fact that the power spectra of houses and vehicles were largely dominated by cardinal orientations, whereas those of animals and flowers were largely isotropic (electronic supplementary material, §S6 and figure S6). Because the oblique effect was established using narrow-band luminance gratings on otherwise uniform backgrounds, it cannot be expected to influence the perception of broad-band natural images such as those used in our experiments. Indeed, if anything, detection thresholds for cardinally oriented structure tend to be higher than those for tilted structure when those structures are superimposed against broad-band masking stimuli [25].
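For illustration, one standard way to filter cardinal orientations out of an image is to zero the corresponding region of its Fourier spectrum. The sketch below is a minimal reconstruction of that generic technique, not the paper's exact filter, and the bandwidth parameter is an assumption.

```python
import numpy as np

def remove_cardinal_orientations(image, half_width_deg=15.0):
    """Suppress near-horizontal and near-vertical Fourier structure.

    Illustrative sketch (not the paper's exact intercardinal filter):
    Fourier components whose orientation lies within half_width_deg of
    either cardinal axis are zeroed, leaving only oblique structure.
    half_width_deg is an assumed parameter, not taken from the paper.
    """
    spectrum = np.fft.fft2(image)
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    # Orientation of each frequency component, folded into [0, 90) degrees.
    theta = np.degrees(np.arctan2(fy, fx)) % 90.0
    dist_to_cardinal = np.minimum(theta, 90.0 - theta)
    mask = dist_to_cardinal > half_width_deg
    mask[0, 0] = True  # preserve the DC term (mean luminance)
    return np.real(np.fft.ifft2(spectrum * mask))
```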
We note, however, that we do not claim that intercardinal filtering removes all easily detectable structure from the images in man-made categories. Indeed, houses and vehicles almost certainly contain longer, straighter and/or more rectilinear contours than flowers and animals. We therefore performed a detection experiment, measuring detection thresholds (the minimum root mean square contrast required to reliably detect images from each category) to examine whether increased sensitivity to structural features that might dominate the man-made categories could account for the man-made bias (see electronic supplementary material, §S7). It revealed that houses and vehicles did not have lower detection thresholds than images from the non-man-made categories. This finding provides strong evidence against any sensitivity-based account of the man-made bias: whatever structure is contained in the unfiltered images of houses and vehicles, that structure proved to be, on average, no easier to detect than the structure contained in unfiltered images of animals and flowers.
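As a concrete reference for the threshold measure, the sketch below computes root mean square (RMS) contrast and rescales an image to a target contrast. It assumes a simple positive-mean luminance image; the paper's exact threshold procedure is described in the supplementary material.

```python
import numpy as np

def rms_contrast(image):
    """Root mean square contrast of a luminance image.

    RMS contrast is the standard deviation of luminance divided by the
    mean luminance; detection thresholds are expressed in this quantity.
    Assumes the image has positive mean luminance.
    """
    image = np.asarray(image, dtype=float)
    return image.std() / image.mean()

def scale_to_contrast(image, target_rms):
    """Rescale an image around its mean to reach a target RMS contrast.

    A minimal sketch of how stimulus contrast could be set during a
    detection staircase; not necessarily the paper's procedure.
    """
    image = np.asarray(image, dtype=float)
    mean = image.mean()
    return mean + (image - mean) * (target_rms / rms_contrast(image))
```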
The absence of both a bias for animals and a sensitivity difference between image categories appears to contradict the findings of Crouzet et al. [15], who reported that the detection of animals precedes that of vehicles in a saccadic choice task. However, comparing contrast sensitivity (detection) with saccadic reaction (decision) is problematic, especially with high-contrast stimuli [26]. The difference could also be attributed to the backgrounds of the images that must be classified. While Crouzet et al. [15] controlled contextual masking effects on image category by presenting images occurring in both man-made and natural contexts, our images in the detection experiment were embedded in noise with the same amplitude spectrum as the image (electronic supplementary material, figure S7). As Hansen & Loschky [27] report, the type of mask used (e.g. a mask sharing only the amplitude spectrum with the image versus one sharing both amplitude and phase information) affects masking strength, and it remains unclear which type of mask is most effective across different image categories [27].
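A common way to construct such amplitude-matched noise is to keep an image's Fourier amplitudes and randomize its phases. The sketch below illustrates this generic technique; it is not necessarily the paper's exact procedure.

```python
import numpy as np

def amplitude_matched_noise(image, rng=None):
    """Generate a noise field sharing the amplitude spectrum of `image`.

    Illustrative sketch: the image's Fourier amplitudes are retained and
    its phases replaced with uniform random phases, a standard way to
    build masks that share only the amplitude spectrum with the target.
    """
    rng = np.random.default_rng() if rng is None else rng
    amplitude = np.abs(np.fft.fft2(image))
    phases = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, image.shape))
    # Taking the real part yields a real-valued noise image whose
    # spectrum closely matches the target amplitudes.
    noise = np.real(np.fft.ifft2(amplitude * phases))
    # Restore the original mean luminance.
    return noise - noise.mean() + image.mean()
```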
Although we carefully controlled the spatial frequency content of our stimuli in experiments 1 and 2, it is conceivable that the bias towards man-made objects arises at a level intermediate between the visual system's extraction of these low-level features and its classification of stimuli into semantic categories. To investigate whether any known ‘mid-level’ features might be responsible for the bias, we repeated experiments 1 and 2 with HMAX, a computer-based image classifier modelled on the neural computations mediating object recognition in the ventral stream of the visual cortex [28,29], which allows it to exploit mid-level visual features in its decision processes (see electronic supplementary material, §§S4 and S10). We also classified the hybrids from experiment 2 with the AlexNet deep convolutional neural network (DNN), which could potentially capture more mid-level features [30] (see electronic supplementary material, §S9). The results indicate that human observers' bias for man-made images is unlikely to be a simple function of the low- and mid-level features exploited by conventional image-classification techniques.
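For readers who wish to reproduce the general approach, the sketch below classifies an image with a pretrained AlexNet via torchvision. The file name is hypothetical, and the paper's exact pipeline (supplementary §S9) may differ from this minimal version.

```python
import torch
from torchvision import models
from PIL import Image

# Load AlexNet with ImageNet weights and its matching preprocessing.
weights = models.AlexNet_Weights.IMAGENET1K_V1
model = models.alexnet(weights=weights).eval()
preprocess = weights.transforms()

def classify(path):
    """Return AlexNet's top ImageNet label and probability for an image."""
    img = Image.open(path).convert("RGB")
    batch = preprocess(img).unsqueeze(0)  # add batch dimension
    with torch.no_grad():
        probs = model(batch).softmax(dim=1)[0]
    top = probs.argmax().item()
    return weights.meta["categories"][top], probs[top].item()

# Hypothetical file name, for illustration only.
label, p = classify("hybrid_example.png")
print(f"{label}: {p:.3f}")
```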
However, we must concede that HMAX and AlexNet do not account for all possible intermediate feature differences between object categories, for instance 3D viewpoint [31]. If we are frequently exposed to many viewpoints of man-made but not of non-man-made objects, this too might produce a man-made bias. Experiments that measure categorical biases after equating object categories for intermediate features are therefore needed to pinpoint the level at which the man-made bias arises. Indeed, the bias for man-made objects might have nothing to do with visual features at all: it may stem from (non-visual) expectations that exploit regularities of the visual environment [6]. To be clear, we are speculating that the preponderance of man-made objects in the environment of urban participants could bias their perception towards efficient processing of these types of stimuli.
When might such a bias develop? Categorical concepts and dedicated neural mechanisms for specific object categories seem to develop after birth, with exposure [32–34]. This suggests that expectations for object categories are likely to develop with exposure too. However, if expectations operate at the level of higher-level features associated with object categories, we cannot discount the possibility that they are innate. For instance, prior expectations for low-level orientation have been attributed to a hardwired non-uniformity in the orientation preferences of V1 neurons [6]. We may similarly possess inhomogeneous neural mechanisms for higher-level features; recently identified neural mechanisms that selectively encode higher-level object features (e.g. uprightness [35]) add to this speculation. It remains to be determined when and how man-made biases arise and whether they adapt to changes in the environment. Further, the perceptual bias that we demonstrate may be altered by testing conditions, which limits its generalizability. For instance, the precedence of low spatial frequencies in image classification is altered by the type of classification that must be performed (e.g. classifying face hybrids by gender versus by expression) [36].