Efficiency of Visual Statistics

Lead Research Organisation: City, University of London
Department Name: School of Health Sciences

Abstract

The merest glance is usually sufficient for an observer to get the gist of a scene. That is because the visual system statistically summarises its input. For example, the observer may infer a forest scene if there are significantly more than 20 trees in the picture. If most of those trees aren't within a few degrees of vertical, the observer may infer that the camera was tilted. Our interest is primarily the efficiencies with which statistics like these are calculated. For example, how many trees can the observer use to estimate camera angle? Certain pictorial details are also inaccessible when observers do not look directly at an object. Like brief glimpses, our peripheral vision is often merely statistical. We will compare the efficiencies with which central and peripheral vision are capable of computing various image statistics, like orientation, position, and size. We will also investigate claims of greater efficiency for noticing differences in humanoid images. In fact, we now know that the visual system tends to exaggerate image differences. Happy or dark faces can appear less happy or dark, when surrounded by even happier or darker ones. We will test how well the amount of exaggeration can be predicted by visibility of the difference. To insulate our results from bias, we will focus on pictorial differences that cannot be so easily described.

Planned Impact

This proposal is a collaboration between City University London, Max-Planck Koln and UC-Irvine. Our research will be disseminated in the usual way: the PIs and postdoc will attend one international conference each year, and present their work. All three institutions have PR departments which can help generate buzz. The postdoc should certainly enjoy him/herself working in the Solomon/Morgan lab. And of course we would attempt to further a good postdoc's career in any way possible. At the very least, our research will be of interest to the other vision scientists (e.g. Roger Watt and Steve Dakin) who have focussed on orientation statistics. Naturally, we hope that the more cognitively oriented psychologists who have investigated the visual statistics of size (e.g. Dan Ariely, Anne Treisman and Dan Simons) will adopt what we feel is a more principled approach to the measurement of statistical efficiency. If our intuition regarding crowding (i.e. that it reflects an obligatory statistical analysis with high efficiency) proves correct, then our work may influence the much larger group of scientists currently trying to understand that topic. Our other major, as-yet-untested intuition concerns the relationship between salience and gain control. Both of these topics have come under intense scrutiny by psychologists, neuroscientists and computational modellers. Our grandest ambition is that our research may inspire parallel extensions in all of these fields. Vision Science has a long history of benefitting health care, but it is hard to judge the specific implications of our research and when they may be realised. That is the nature of basic research; its benefits are often unforeseen. For example, Solomon's wavelet sensitivity functions (Watson et al 1997) surprisingly formed the basis of many digital watermarks and Chubb et al's (1989) contrast-contrast exaggeration surprisingly has the potential to separate schizophrenics from normal observers (Dakin et al 2005).
Of course, the other side of the coin is that harm is also often unforeseen. Nonetheless, we can be reasonably confident our research will not directly benefit any terrorism or other evil.
 
Description

In a series of six papers, Solomon and colleagues investigated human observers' ability to form statistical summaries from multiple sources of visual information. The statistical summary investigated by Morgan et al (2012) was positional variance. A random adjustment was applied to the position of each dot in an otherwise regular geometric pattern. Observers were asked to compare two such patterns and select the one with the larger positional variance. The results clearly demonstrated that human observers considered the positions of more than one dot in each pattern. However, an ideal observer could have performed just as well with as few as 5 or 6 dots.

Visual estimates of average size are limited to a similar number of items (Solomon et al 2012 used circles). Gorea et al (2014) pursued this line of research, and found that the sizes of 5 or 6 circles could be estimated simultaneously, almost as soon as the circles became visible. This latter finding suggests some hard-wired neural circuit, capable of making multiple size estimates in parallel.

Also consistent with the idea of a hard-wired neural circuit is the finding of an after-effect of perceived regularity (Ouhnana et al 2013): The apparent positional variance in any dot array will be reduced after staring at other dot arrays with higher variance.

Inconsistent with the idea of a hard-wired neural circuit are results from our investigations of orientation averaging. Observers asked to compare the mean orientations in two sets of oriented elements effectively ignore all but 2 or 3 elements in each set (Solomon 2010). When viewing time is limited, the average observer effectively ignores all but 2 (Solomon et al 2016).
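
The idea of an observer who "effectively ignores" all but a few elements can be illustrated with a small simulation. The sketch below is only an illustration of the equivalent-sample-size reasoning, not the model fitted in the cited papers. It assumes Gaussian orientation noise and no internal noise, and all function names and parameter values (set size, noise, mean difference) are hypothetical. An observer who averages n_used of the n_shown elements in each set behaves like an ideal observer given only n_used elements, so n_used / n_shown serves as a simple index of efficiency.

```python
import random
from statistics import NormalDist, fmean


def simulate_accuracy(n_used, n_shown=8, sigma=10.0, delta=5.0,
                      trials=4000, seed=0):
    """Proportion correct for an observer who averages only n_used of the
    n_shown orientations in each set and picks the set with the larger mean.
    All parameter values are illustrative assumptions."""
    if not 1 <= n_used <= n_shown:
        raise ValueError("n_used must be between 1 and n_shown")
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        # Orientations (deg) in the two sets; set B is tilted by delta on average.
        set_a = [rng.gauss(0.0, sigma) for _ in range(n_shown)]
        set_b = [rng.gauss(delta, sigma) for _ in range(n_shown)]
        # The observer effectively ignores all but n_used elements per set.
        if fmean(set_b[:n_used]) > fmean(set_a[:n_used]):
            correct += 1
    return correct / trials


def predicted_accuracy(n_used, sigma=10.0, delta=5.0):
    """Analytic prediction: the difference between two sample means of
    n_used elements each has standard deviation sigma * sqrt(2 / n_used)."""
    return NormalDist().cdf(delta / (sigma * (2.0 / n_used) ** 0.5))


def efficiency(n_used, n_shown):
    """Fraction of the available elements that the observer effectively uses."""
    return n_used / n_shown


if __name__ == "__main__":
    for k in (2, 8):
        print(k, efficiency(k, 8), simulate_accuracy(k),
              round(predicted_accuracy(k), 3))
```

Under these assumptions, matching a human observer's accuracy to the accuracy of the simulated observer gives an estimate of how many elements the human effectively used.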

In the aforementioned six papers, proposals are made for the computations performed by the neural circuits putatively responsible for regularity discrimination and size averaging. However, given human observers' particularly low efficiency, there probably is no neural circuit devoted to orientation averaging.

Finally, in a series of three papers, May and Solomon (2013, 2015a, 2015b) describe the mathematical relationship between neural activity and the consistencies with which human observers can a) detect faint visual stimuli and b) discriminate between stimuli of almost identical contrast.

Exploitation Route

Here are three ways our findings have already been used:
1) Kompaniez, Abbey, Boone, and Webster cited the after-effect of regularity as inspiration in their search for an after-effect of viewing dense radiological images. (They found one. Subsequently viewed images appear less dense.)
2) Protonotarios, Baum et al (J. R. Soc. Interface 2014) applied a modified version of our model for positional-variance encoding to the distribution of bristle cells on developing fruit flies. The aforementioned variance decreases as the flies age. If that isn't a prime example of the unexpected application of basic research, then I don't know what is!
3) Groups as diverse as Hu, Lesmes et al (J. Vis. 2015), who developed a protocol for rapidly assessing contrast sensitivity in clinical populations; Stirman, Townsend, and Smith (Vis. Res. 2016), who developed a touchscreen-based method for assessing motion perception in mice; and Hathibelagal, Feigl et al (JOSA A, 2016), who measured rod-cell signalling, have all used May & Solomon's (2013) 'Four Theorems' to fine-tune their psychophysical methods.

Sectors

Pharmaceuticals and Medical Biotechnology