3. Abstracts

Below is a list of published conference abstracts, including forthcoming presentations.


  • Bruno, A., Segala, F., Smith, I. & Baker, D.H. (2020). Adaptation to visual motion can differentiate between perceptual timing and interval timing. Submitted to VSS 2020.
  • Heywood-Everett, E., Baker, D.H. & Hartley, T. (2020). Testing spatial memory precision thresholds in a view rotation task. Submitted to VSS 2020.

Published (77)

  • Lygo, F.A., Richard, B. & Baker, D.H. (2019). Multimodal neuroimaging of interocular contrast responses in human amblyopia. Perception, 48(S1): 51.

In amblyopia, neural responses from the amblyopic eye are attenuated compared to fellow eye responses. Interestingly, attenuation is greater when measured with steady-state visual-evoked potentials (SSVEPs) than functional magnetic resonance imaging (fMRI) measurements. We directly compared these two techniques by measuring interocular responses of amblyopic participants (N = 12) and controls (N = 10) to sinusoidal gratings (4 Hz flicker) of different contrasts (0%, 1.5%, 6%, 24%, and 96%), factorially combined across the two eyes. We found a deficit in the amblyopic eye response, which differed between SSVEPs and fMRI in a manner similar to previous reports. We explored different computational models of binocular vision in amblyopia to explain these responses. An attenuation model provides adequate fits for both data sets, but a shift in contrast gain improves model fits for SSVEP data. This suggests subtle architectural differences for data derived from different modalities.

Gist is a series of characteristics humans extract rapidly to make judgments about the content and nature of a scene. This quick extraction of information happens for multiple image categories at the same time, but these outputs can show either probability summation or destructive interference depending on the task contingencies at hand (Evans et al., 2011). We explored the dynamics of gist processing and modulation of neuronal response due to changing task contingencies across three rapid event-related experiments. We used both univariate cluster-corrected comparison of ERPs and multivariate pattern analysis of electroencephalographic responses across the scalp. In the experiments, observers were asked to categorize briefly presented (25 ms) pre-cued images from different categories. In experiments 1 & 2 we examined conflict arising when both the cued target and a task-irrelevant but primed target are present. In experiment 3 we expanded the number of categories and introduced additional task contingencies (find both, or either, of the two targets). Findings show that the gist of an image is discriminable from 50 ms post stimulus onset. Onset of different patterns of responses to different target categories is modulated by task relevance. When image categories are potential targets the differentiating pattern for diverse categories arises at the same time as target detection (50-100 ms post stimulus). However, non-target image categories are differentiated later, 100 ms after the target gist is detected. Lastly, changes in task contingency differentially influence the pattern of EEG responses, but only from around 300 ms post stimulus onset. In conclusion, we are able to differentiate between gists as soon as we detect the presence of the cued target but this differentiation is delayed when the categories are not task relevant. The effects of task contingencies modulate rapid gist processing, but only at the decisional stage.

  • Baker, D.H., Lygo, F.A., Meese, T.S. & Georgeson, M.A. (2018). Binocular summation: a meta analysis of 65 studies. i-Perception, 9(IS): 1.

Binocular summation is the advantage in contrast sensitivity when using two eyes versus one. It has been widely studied owing to its clinical importance as a measure of binocular function, and because the precise level of summation is determined by the magnitude of nonlinearities in the early visual system, before binocular combination. However, most studies have involved small sample sizes, making exact estimation problematic. We conducted a meta-analysis of 65 studies reporting psychophysical estimates of binocular summation in 716 observers. The lower bound of the 95% confidence interval on the mean summation ratio was consistently above the canonical value of √2, regardless of how studies were weighted. We further explored how methodological factors affect summation estimates, both by using subsets of the meta-analysis data and also confirming with stand-alone studies. These analyses show that stimulus factors such as spatiotemporal frequency affect summation, and that the imbalance in sensitivity across the eyes can moderate summation estimates. We suggest that there is no single canonical value for binocular summation, but that instead it takes on a range of values between approximately √2 and 2, depending on stimulus properties. In addition, when the two eyes are not balanced, summation estimates are reduced when calculated relative to the threshold of the more sensitive eye, but can be slightly elevated when the mean monocular threshold is used. Future studies can obtain accurate summation estimates by normalising monocular contrasts to account for sensitivity differences or by modelling results using a simple two-parameter model of binocular combination.
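The two ways of calculating a summation ratio mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration: the thresholds are invented (they are not data from the meta-analysis), and the function names are hypothetical. Ratios are also expressed in dB, assuming the usual 20·log10 convention for contrast.

```python
import math

def summation_ratios(left_thresh, right_thresh, binoc_thresh):
    """Return (ratio vs. more sensitive eye, ratio vs. mean monocular threshold).

    Thresholds are contrast detection thresholds, so lower means more sensitive.
    """
    best_mono = min(left_thresh, right_thresh)
    mean_mono = (left_thresh + right_thresh) / 2
    return best_mono / binoc_thresh, mean_mono / binoc_thresh

def to_db(ratio):
    """Express a contrast ratio in decibels (20 * log10)."""
    return 20 * math.log10(ratio)

# Hypothetical thresholds (% contrast) for an observer with imbalanced eyes.
vs_best, vs_mean = summation_ratios(left_thresh=1.0, right_thresh=2.0, binoc_thresh=0.9)
print(f"relative to more sensitive eye: {vs_best:.2f} ({to_db(vs_best):.1f} dB)")
print(f"relative to mean monocular:     {vs_mean:.2f} ({to_db(vs_mean):.1f} dB)")
```

With imbalanced monocular thresholds the two conventions give different ratios for the same binocular threshold, which is why the choice of baseline matters when comparing studies.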

  • Spencer, L., Wade, A.R., Baker, D.H. & Evans, K.K. (2017). Neuronal and temporal correlates of “Gist” processing. Journal of Vision, 17(10): 523.

Humans can rapidly extract the ‘gist’ of images, using global image and summary statistics. This allows for quick extraction of information for multiple categories, but these outputs can interfere destructively depending on the task at hand (Evans et al., 2011). Using rapid event-related fMRI and EEG in two experiments, we investigated the neuronal correlates of gist processing and their modulation due to changing task contingencies. In the fMRI experiment a combination of noise masks and two different category images were presented in quadrants of the visual field simultaneously for 200 ms. Observers reported the presence and quadrant of a pre-defined target category. We measured BOLD responses in pre-localised, category selective cortical regions and conducted additional whole-brain analyses. In the EEG experiment observers were also asked to categorize briefly presented (25 ms) pre-cued images from six categories. Multivariate pattern analysis (MVPA) of EEG responses was used to identify patterns of activity across the scalp. fMRI results show category-selective activation in extrastriate areas, supporting their involvement in gist perception, and EEG data revealed that gist is discriminable from 50 ms post stimulus onset. No top-down-driven activation in target locations was observed in early visual cortex, consistent with the observation of gist extraction without the ability to localize the target. Responses to changes in the image category task contingencies during the experiment were evident only in frontal areas. Consistent with this, changes in task contingency influenced the pattern of EEG responses only from around 300 ms post stimulus onset. In conclusion, we find that activity in category specific extrastriate visual areas correlates with spatially non-specific, rapid gist perception and that these areas presumably pool signals from earlier areas with lower featural selectivity. Lastly, the effects of task contingencies modulate this rapid gist processing only at the decisional stage.

Transcranial magnetic stimulation (TMS) is often used to link behaviour to anatomy by targeting a brain area during an associated task. Decreases in performance on that task are often explained as a suppression of stimulus-driven signals, but could also be explained by increases in neural noise. This study used a 2IFC double-pass contrast discrimination paradigm (Burgess & Colborne, 1988, J Opt Soc Am A, 5:617-627) to distinguish between these two possibilities in four types of TMS: online single-pulse (spTMS), online three-pulse repetitive (rTMS), offline continuous (cTBS) and intermittent theta burst stimulation (iTBS). Using standard stimulation protocols with a Magstim Super Rapid2, online TMS was applied to early visual cortex 50 ms after onset of each stimulus in each interval, and offline TBS was applied before the start of the task. On each trial (200 total) two grating stimuli of random contrast were presented peripherally (position determined by phosphene localization). Half of the trials contained a 4% contrast increment in one of the intervals. The exact same trial sequence was then repeated with randomized interval order (second pass). A decrease in accuracy in the 4% target condition would indicate signal suppression whereas a reduction in consistency of responses between the two passes would indicate an increase in neural noise. Mean accuracy and consistency scores were bootstrapped within participants. It was found that spTMS reduced accuracy whereas rTMS decreased consistency. This implies that spTMS decreases signal strength whilst rTMS increases neural noise without affecting the stimulus-driven signal. Offline stimulation (cTBS, iTBS) did not affect accuracy or consistency. This is the first study to compare several types of TMS using a single paradigm that can dissociate noise from suppression. These findings can explain inconsistencies in results between previous studies using different TMS protocols, and so comparisons across protocols should be made with caution.
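The logic of the double-pass consistency measure can be illustrated with a toy simulation. The sketch below is not the analysis used in the study: it assumes a simple observer who picks the interval with the larger noisy contrast, places the increment on every trial rather than half of them, and uses arbitrary noise levels. All names and parameter values are hypothetical.

```python
import random

def double_pass(internal_sd, n_trials=4000, increment=4.0, external_sd=4.0, seed=1):
    """Simulate a two-interval double-pass experiment.

    Returns (accuracy on the first pass, consistency between the two passes).
    """
    rng = random.Random(seed)
    n_correct = 0
    n_same = 0
    for _ in range(n_trials):
        target = rng.randrange(2)  # interval that contains the increment
        # External (stimulus) contrast jitter is identical on both passes.
        stim = [rng.gauss(0.0, external_sd), rng.gauss(0.0, external_sd)]
        stim[target] += increment
        responses = []
        for _pass in range(2):
            # Internal noise is drawn afresh on every pass.
            noisy = [s + rng.gauss(0.0, internal_sd) for s in stim]
            responses.append(0 if noisy[0] > noisy[1] else 1)
        n_correct += responses[0] == target
        n_same += responses[0] == responses[1]
    return n_correct / n_trials, n_same / n_trials

low_noise = double_pass(internal_sd=1.0)
high_noise = double_pass(internal_sd=16.0)
print("low internal noise  (accuracy, consistency):", low_noise)
print("high internal noise (accuracy, consistency):", high_noise)
```

Because the stimulus jitter repeats across passes while the internal noise does not, consistency falls towards 50% as internal noise grows. In this toy observer accuracy falls as well, so the sketch shows only how consistency is computed, not the full dissociation between noise and suppression.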

  • Coggan, D.D., Watson, D.M., Hartley, T., Baker, D.H. & Andrews, T.J. (2017). A data-driven approach to stimulus selection reveals the importance of visual properties in the neural representation of objects. Journal of Vision, 17(10): 29.

The neural representation of objects in the ventral visual pathway has been linked to high-level properties of the stimulus, such as semantic or categorical information. However, the extent to which patterns of neural response in these regions reflect more basic underlying principles is unclear. One problem is that existing studies generally employ stimulus conditions chosen by the experimenter, potentially obscuring the contribution of more basic stimulus dimensions. To address this issue, we used a data-driven analysis to describe a large database of objects in terms of their visual properties (spatial frequency, orientation, location). Clustering algorithms were then used to select images from distinct regions of this feature space. Images in each cluster did not clearly correspond to typical object categories. Nevertheless, they elicited distinct patterns of response in the ventral stream. Moreover, the similarity of the neural response across different clusters could be predicted by the similarity in image properties, but not by the similarity in semantic properties. These findings provide an image-based explanation for the emergence of higher-level representations of objects in the ventral visual pathway.

The visual system must combine information from the two eyes into a single binocular percept. Several psychophysical studies have converged on a gain control model (Meese, Georgeson, & Baker, 2006, Journal of Vision, 6, 1224–1243) that accurately describes contrast discrimination and matching data for any combination of contrasts in the left and right eyes. This model makes explicit quantitative predictions about how information from the two eyes is encoded by neurons in early visual cortex. These predictions were tested using a steady-state visual evoked potential paradigm, and the model described data from six conditions better than five alternative algorithms. The model can be considered a biological implementation of a Kalman filter, which is the statistically optimal algorithm for combining two noisy inputs. We next consider how the anatomical implementation of binocular vision determines perception and neural response. We measured evoked responses to simple stimuli presented in different ocular combinations. A support vector machine algorithm was able to use the pattern of voltages across the scalp to decode eye of presentation (left vs. right) and to discriminate between monocular and binocular presentation, between ∼100 and 300 ms following stimulus onset. However, observers performed at chance for reporting eye-of-origin and were relatively poor at discriminating monocular versus binocular presentation. Information encoded structurally in early visual areas (i.e., in ocular dominance columns) therefore produces identifiable patterns of electrical response but is lost to perception at higher levels of processing. Taken together, these studies demonstrate how Marr’s “three levels” (computational theory, algorithm, and implementation) provide a framework for studying binocular vision.

  • Vilidaitė, G. & Baker, D.H. (2016). Predicting perception from the electroencephalogram. Perception, 45(S): 229.

Internal noise in sensory systems limits processing by corrupting the fidelity of neural signals. We asked when neural activity could predict perceptual decisions by using multivariate analysis of event related potentials (ERPs). Observers saw two 1 c/deg sine-wave gratings and reported the interval that appeared to have the higher contrast on each trial, while EEG activity was recorded at 64 scalp locations. One interval contained a pedestal grating with a contrast of 50%, the other contained the pedestal plus a target increment (0-16% contrast). We used a support vector machine algorithm to classify observer decisions by training on the selected/non-selected interval. When the stimuli were physically identical (0% target increment), classification was above chance in periods around 100 ms, 250 ms and 400 ms for predicting the observer’s decisions. This may reflect the influence of internal noise at different stages of processing. We also trained the algorithm on the observer’s decisions in the 16% target contrast condition (where psychophysical performance was near ceiling) and then tested on the 0% target trials. Classification was above chance at around 100 ms and 200 ms, reflecting early components of the neural response. Future work will aim to resolve the anatomical locus of noise at different stages of decision making.

  • Meese, T.S. & Baker, D.H. (2016). Grid-texture mechanisms in human vision: contrast detection of regular sparse micro-patterns requires specialist templates. Perception, 45(S): 358.

Previous work has shown that human vision performs spatial integration of luminance contrast energy, where signals are squared and summed (with internal noise) over area at detection threshold. We tested that model here in an experiment using arrays of micro-pattern textures that varied in overall stimulus area and sparseness of their target elements, where the contrast of each element was normalised for sensitivity across the visual field. We found a power-law improvement in performance with stimulus area, and a decrease in sensitivity with sparseness, and rejected a model involving probability summation across all elements. While the contrast integrator model performed well when target elements constituted 50-100% of the target area (replicating previous results), observers outperformed the model when texture elements were sparser than this. This result required the inclusion of further templates in our model, selective for various regular texture densities. By assuming probability summation across these mechanisms the model also accounted for the increase in the slope of the psychometric function that occurred as texture density decreased. Thus, we have revealed texture density mechanisms for the first time in human vision at contrast detection threshold (where the fitted level of intrinsic uncertainty was low and the only free parameter).

  • Benjamin, A.V., Wailes-Newson, K., Baker, D.H. & Wade, A. (2016). No evidence for a locomotion-induced change in human surround suppression. Perception, 45(S): 324.

Electrophysiological studies in rodents have shown that visual processing is modulated by the locomotion of the animal (Niell & Stryker, 2010). In 2013, Ayaz et al. reported a surprising reduction in surround suppression (SS) in mice running on an air-cushioned ball. Here we used psychophysics to ask whether a similar effect is evident in humans. We measured contrast discrimination thresholds for Gabor patches at 5 different pedestal levels and three surround conditions (‘no surround’, ‘high contrast parallel grating’, ‘high contrast orthogonal grating’). Data were measured while subjects were a) standing still, or b) walking briskly (7 km/h) on a treadmill. We measured orientation-tuned SS in both conditions. However, the magnitude of SS increased rather than decreased in the locomotion condition. A further experiment indicated that one potential explanation for this increase was image blurring due to head movements during locomotion. We explore the possibility that the reduction in SS measured with electrophysiology in head-fixed rodents reflects a general reduction in contrast masking during locomotion that might serve to maintain edge contrast in jittered visual scenes.

  • Baker, D.H. & Richard, B. (2016). A dynamic double pass technique for characterizing internal noise during binocular rivalry. Journal of Vision, 16(12): 1324.

The perceptual alternations characteristic of binocular rivalry have a stochastic component that is a consequence of internal noise in the visual system. Yet little is known about the properties of this noise, as we lack methods to probe it directly. We used a standard binocular rivalry monitoring paradigm, in which observers viewed a pair of dichoptic orthogonal gratings (2c/deg, 50% contrast) for trials of one minute duration, and reported their percepts continuously using a mouse. We then injected dynamic noise into the stimulus by adding independent sequences of temporally bandpass-filtered noise to the contrasts of the two gratings. The peak temporal frequency (0.0625 to 1Hz) and standard deviation (1 to 16%) of the noise were combined factorially giving 25 conditions. Dominance durations and autocorrelation functions calculated across five repetitions per observer showed effects of both noise variance and frequency, indicating that alternations were driven by high amplitude external noise at all temporal scales. We then repeated the experiment using the same samples of noise for each condition, and calculated the consistency of the observer’s responses across the two passes, in a novel dynamic variant of the ‘double pass’ method (Burgess & Colborne, 1988, J Opt Soc Am A, 5: 617-627). Consistency increased from baseline (50%) at noise standard deviations of 4% and above, providing an approximate estimate of the amplitude of internal noise. Consistency scores showed bandpass tuning, with the highest consistency (around 70%) occurring at 0.125Hz, falling off at higher and lower frequencies. Using anticorrelated noise across the eyes (rather than uncorrelated noise) produced a slight increase in consistency to around 75%, providing an upper bound on the range of values. Cross-correlation between the noise streams and observer responses indicated a response latency of around 0.5-1s. We discuss our results in the context of computational models of the rivalry process.

  • Vilidaitė, G., Yu, M. & Baker, D.H. (2016). Highly correlated internal noise across three perceptual and cognitive modalities. Journal of Vision, 16(12): 809.

Neural variability (noise) is an important limiting factor in neural processing widely observed in neurophysiological studies. Abnormal levels of neural noise have been implicated in some neurological disorders such as autism. Noise in the early visual system can be measured using noise masking paradigms in which stimulus noise is used to inject external variability. However, previous research has not been able to compare this noise in other cognitive modalities. We used a 2AFC double-pass paradigm (Burgess & Colborne, 1988, J Opt Soc Am A, 5: 617-627) in which stimulus intensity was jittered on a trial-by-trial basis for three discrimination tasks: a) grating contrast; b) facial expression intensity; and c) numerical summation (in which participants were asked which set of four numbers had the higher sum). The tasks were repeated twice with 200 trials/rep and consistency of responses between the passes was calculated. We tested 43 neurotypical observers and also obtained an estimate of autistic traits with the Autism Quotient (AQ). There were substantial significant positive correlations between consistency scores across all three modalities: faces and numbers (Spearman’s ρ=.63, p< .0001), faces and contrast (ρ=.71, p< .0001) and numbers and contrast (ρ=.56, p< .0001). Furthermore, a Principal Components Analysis showed that all three consistency scores load onto a single factor, which explained 77% of the variance. Individual observers’ factor loadings were also found to be significantly negatively correlated with AQ scores (ρ=-.39, p=.009, two-tailed), suggesting that those with more autistic traits had higher internal variability. Our results imply either a single source of late decision noise, common across all tasks, or a common factor of endogenous noise across the various brain regions involved in each task. Our finding of lower response consistency in people with higher levels of autistic traits supports current theories of increased internal noise in autism.

  • Cunningham, D., Baker, D.H. & Peirce, J. (2016). Measuring Selective Responses to Coherent Plaids Using the Intermodulation Term. Journal of Vision, 16(12): 304.

The visual system combines information that is propagated from the retina in order to generate a coherent percept of the environment. While much is known about how primary visual cortical mechanisms encode low-level image features, relatively little is known about how this encoded information is then processed by mid-level mechanisms. By frequency tagging the components of a stimulus at different temporal frequencies and measuring steady-state visual evoked potentials (SSVEPs), we can examine the individual responses to each of the components at their fundamental component frequencies as well as various nonlinear interactions. Responses at the intermodulation frequencies (the sum and difference of the component frequencies) indicate nonlinearities at or after the point of combination. These can arise from either suppressive effects or combination effects such as AND responses to the compound. We have used the multi-component frequency-tagging technique to study responses to the combination of gratings into various plaid patterns. We combined two components (1cpd and 3cpd, respectively) into orthogonal plaid patterns with either matched spatial frequencies (‘coherent’ plaids) or non-matched (‘non-coherent’ plaids). Grating components were simultaneously flickered at different frequencies (2.3Hz, 3.75Hz) resulting in fundamental component-based responses at these frequencies, as well as intermodulation responses at their difference (1.45Hz) and sum (6.05Hz). The nonlinearities generated in response to the gratings and plaids were investigated by comparing several response frequencies. These included the component frequencies, the first-order intermodulation responses, and the various harmonic responses associated with these. The technique provides a rich set of data that we can investigate with a family of computational models. From this we can determine how the various nonlinearities (suppressive, additive etc.) contribute towards different response patterns. In particular, the harmonic of the sum intermodulation frequency appears in this case to differentiate second-order mechanisms from suppressive interactions in V1.
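The intermodulation frequencies quoted in this abstract follow directly from the two tagging frequencies. As a minimal sketch, the Python below enumerates the positive combinations n·f1 + m·f2 for small nonzero integers n and m; the function name and the `max_weight` parameter (a cap on |n| + |m|) are hypothetical conveniences, not terminology from the study.

```python
def intermodulation_terms(f1, f2, max_weight=2):
    """Map (n, m) -> n*f1 + m*f2 for nonzero integers n, m with |n| + |m| <= max_weight.

    Only positive frequencies are kept; values are rounded to hide float debris.
    """
    terms = {}
    for n in range(-max_weight, max_weight + 1):
        for m in range(-max_weight, max_weight + 1):
            if n == 0 or m == 0 or abs(n) + abs(m) > max_weight:
                continue
            freq = n * f1 + m * f2
            if freq > 0:
                terms[(n, m)] = round(freq, 4)
    return terms

# Tagging frequencies from the abstract (Hz).
sum_and_difference = intermodulation_terms(2.3, 3.75)
higher_terms = intermodulation_terms(2.3, 3.75, max_weight=4)

for (n, m), freq in sorted(higher_terms.items(), key=lambda item: item[1]):
    print(f"{n:+d}*f1 {m:+d}*f2 = {freq} Hz")
```

With the default cap the only terms are the difference (1.45 Hz) and the sum (6.05 Hz); raising the cap to 4 adds, among others, the term with n = m = 2 at 12.1 Hz, i.e. twice the sum frequency.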

  • Richard, B., Chadnova, E. & Baker, D.H. (2016). Dichoptic imbalance of luminance and its effects on the phase component of steady-state EEG signals. Journal of Vision, 16(12): 431.

A neutral density (ND) filter placed before one eye will produce a dichoptic imbalance in luminance, which attenuates responses to visual stimuli and generates a lag in neural signals from retina to cortex in the filtered eye (Wilson & Anstis, 1969, Am J Psychol, 82, 350-358). This, in turn, can induce disparity cues that lead to an illusory percept of depth (e.g., the Pulfrich effect). Here, we explored how the increased latency of the filtered eye alters neural responses to stimuli presented either monocularly or binocularly. We measured steady-state visual evoked potential (SSVEP) contrast response functions from the occipital pole at 6 different contrast values (0 to 96%) with 3 cycles/° sinusoidal gratings flickering at 5 Hz. To manipulate the balance of luminance between the eyes, neutral density filters (0.6, 1.2, and 1.8 ND) were placed in front of the dominant eye of observers while stimuli were presented at maximum contrast either to the filtered eye or to both eyes. The amplitude component of SSVEPs increased monotonically as a function of stimulus contrast and decreased as a function of filter strength in both monocular and binocular viewing conditions. For monocular stimuli, the ND filter increased the lag of the phase component of SSVEPs, up to a latency of 63 ms (95%CI +/- 31ms) at a filter of 1.8 ND. However, under binocular conditions, no apparent phase lag in the SSVEPs could be identified. This is indicative of a binocular combination process that suppresses the lagged input from the filtered eye. We explain these data with a computational model that implements a variable temporal impulse response function placed prior to a binocular contrast gain control mechanism, which, under binocular viewing conditions, suppresses the attenuated and lagged responses of the filtered eye. This model, additionally, offers insight on interocular interactions that may occur in amblyopia.

  • Coggan, D.D., Andrews, T.J. & Baker, D.H. (2016). Investigating the temporal properties of visual object processing using a multivariate analysis of EEG data. Journal of Vision, 16(12): 1311.

An understanding of human object recognition requires combining both spatial and temporal information about neural activity. Previous studies using fMRI have found distinct spatial patterns of response in the ventral visual pathway. However, the temporal dynamics of these patterns of response are less clear. Here, we acquired human electroencephalography (EEG) responses to images from different object categories (bottle, face, house). Our aims were to determine (1) whether there are distinct patterns of EEG response to different object categories; (2) the temporal properties of these patterns and (3) the extent to which these patterns are based on low-level image properties. Participants viewed images of bottles, faces and houses while EEG data were acquired from 64 electrodes. A correlation-based multivariate pattern analysis revealed distinct patterns of response across electrodes to different object categories that emerged at approximately 90 msec and remained distinct until approximately 600 msec after stimulus onset. Next, we asked whether these patterns of neural response could be explained by selectivity to more basic properties of the stimulus. To address this question, we measured patterns of EEG response to phase-scrambled images. Our rationale for using scrambled images is that they have many of the image properties found in intact images, but do not convey any categorical or semantic information. Again, distinct patterns of response to scrambled images from different object categories emerged approximately 70-100 msec after stimulus onset. Moreover, the patterns of neural response to scrambled images from each object category were similar to the patterns of response for intact images. However, distinct patterns to scrambled images were only evident until approximately 300-450 msec after stimulus onset. Together, these results provide new insights into the temporal dynamics of object processing in the human brain.

  • Gray, K.L.H., Lygo, F., Yu, M. & Baker, D.H. (2016). A common factor may underlie personality traits and both neural and perceptual responses to emotional faces. Journal of Vision, 16(12): 75.

Difficulties in interpreting emotional faces are often linked with autistic traits and alexithymia. Yet it is unclear whether these problems are due to reduced neural processing of faces, or differences in decision strategies in perception experiments. To measure neural responses, we adapted a steady-state EEG paradigm (Liu-Shuang, Norcia & Rossion, 2014, Neuropsychologia, 52:57-72) for emotion detection. Random identity neutral faces were presented periodically (at 5Hz) with ‘oddball’ targets appearing every fifth cycle (at 1Hz). The targets were emotional faces of six emotions (angry, happy, sad, fear, disgust, surprise), morphed along a continuum relative to neutral (0, 6, 12, 24, 48, 96 and 144% emotional intensity). Lateralised occipito-parietal responses at harmonics of the oddball frequency (2, 3 & 4Hz) increased monotonically with morph level for 24 observers, and showed substantial individual variability that was unrelated to the baseline response at the flicker frequency. Observers also completed a 2IFC experiment using the same temporal parameters as the EEG experiment. Observers were presented with a target stream containing a single emotional face embedded within 8 neutral distractors, and a null stream containing only neutral faces, and asked to indicate which stream contained the emotional target. Accuracy increased as a function of morph level, and 75% correct thresholds were estimated from the psychometric functions. Inverting the faces significantly reduced detection performance and EEG responses. We also measured autism quotients and alexithymia using standard scales, and found a typical level of correlation between these measures (r=0.61, p< 0.01). Alexithymia correlated with psychophysical thresholds (r=0.49, p< 0.05), whereas autistic traits correlated negatively with maximum EEG amplitudes (r=-0.50, p< 0.05). Principal components analysis of all four variables identified a single factor with high loadings for each measure. These results suggest that personality differences are predictive of both neural responses to, and perceptual experience of, emotional expression.

  • Baker, D.H., Gutmanis, N., Harris, S., Kitching, R.E., Melton, H.A., Norman, R., Scott, C.J.T., Smith, A.K., Wailes-Newson, K. & Vilidaitė, G. (2016). Nonlinear interactions between steady-state responses to visual stimuli. Perception, 45(6): 697-698.

Flickering visual stimuli produce entrained neural activity that is easily measured using electroencephalography, termed the steady-state visually evoked potential (SSVEP). For a single flicker frequency, the brain generates SSVEPs at the fundamental frequency and its higher harmonics, and for multiple frequencies intermodulation responses also occur at the sums and differences of the individual frequencies. We measured SSVEPs from 100 individuals for 11 s trials viewing one or two component stimuli (0.5 c/deg, 7 and 5Hz on or off flicker, EEG sample rate 1 kHz). The unprecedented signal fidelity that emerges from our large sample size permits several novel observations. The first harmonic responses (1F) decrease rapidly during the first second following stimulus onset, whereas the second harmonic responses (2F) increase during the first 2 s. We attribute this to responses from cells that code changes in the mean contrast level of a stimulus (e.g., complex cells) initially responding to differences from mean luminance (at 1F) and then signaling both increases and decreases of contrast (at 2F). With two frequencies, we observe many responses at nonintermodulation integer frequencies, including 3, 4, 9, and 16–19 Hz, particularly when data are averaged coherently across participants. Many of these are predicted by a two-input contrast gain control model (Busse, Wade, & Carandini, 2009, Neuron, 64, 931–942) fitted to the entire Fourier spectrum from 1 to 20Hz at a range of contrasts. However, the model underestimates the 2F amplitude and overestimates the amount of suppression between frequencies. We suggest a multi-stage model with several nonlinearities may be needed to explain all our results.
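The phrase "averaged coherently across participants" refers to averaging the complex Fourier components (amplitude and phase) before taking the amplitude, rather than averaging amplitudes alone. The sketch below contrasts the two using invented unit-amplitude responses; the participant count matches the abstract, but the phase distributions and function names are assumptions made purely for illustration.

```python
import cmath
import math
import random

def coherent_amplitude(components):
    """Amplitude of the mean complex component (phase information is retained)."""
    return abs(sum(components) / len(components))

def incoherent_amplitude(components):
    """Mean of the individual amplitudes (phase information is discarded)."""
    return sum(abs(c) for c in components) / len(components)

rng = random.Random(0)
n_participants = 100

# Hypothetical stimulus-locked response: similar phase in every participant.
locked = [cmath.rect(1.0, 0.3 + rng.gauss(0.0, 0.2)) for _ in range(n_participants)]
# Hypothetical noise at the same frequency: phase is random across participants.
noise = [cmath.rect(1.0, rng.uniform(-math.pi, math.pi)) for _ in range(n_participants)]

print("locked:", coherent_amplitude(locked), incoherent_amplitude(locked))
print("noise: ", coherent_amplitude(noise), incoherent_amplitude(noise))
```

Phase-locked components survive coherent averaging almost intact, whereas components with random phase largely cancel, which is one reason small responses can become visible in a large sample.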

  • Danilenko, O., Ozolinsh, M. & Baker, D.H. (2016). Time course and interocular transfer of size and contrast adaptation aftereffects. Perception, 45(6): 698.

We performed psychophysical experiments to evaluate adaptation to size and contrast, and the dependence of these processes on stimulus texture. Stimuli were discs or rectangles presented against a mid-gray background, consisting of sine-wave or plaid patterns or areas of uniform luminance. In the adaptation phase for size adaptation experiments, a number of large discs (with diameter twice that of a “standard”) jittered in position (based on Baker & Meese, 2012, Perception, 41(S), 33) in the left side of the visual field for up to 20 s. Target stimuli were then displayed, consisting of a fixed diameter standard in the adapted hemifield, and a comparison stimulus of variable size in the unadapted hemifield. Participants indicated which of the target stimuli appeared larger in size. By varying the adaptation period, we obtained exponential time courses for size adaptation with a time constant of tau_size=7.1 s (SE=±0.6 s). Additional control experiments found no effect of the relative orientation of adaptor and target textures. Adapting to textured stimuli in the center of the visual field produced a strong orientation-specific contrast sensitivity aftereffect with a time constant that was shorter than that measured for size adaptation (tau_contr=3.9 s, SE=±0.2 s). Similar values of time constant were observed during recovery from adaptation. For contrast sensitivity, we also measured interocular transfer of adaptation by adapting and testing in the same eye or in different eyes. The latter case revealed a 75% reduction of the total adaptation depth, suggesting a strong monocular contribution to contrast adaptation.
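As a rough illustration of what the two time constants imply, the sketch below assumes a saturating exponential build-up, A(t) = A_max * (1 - exp(-t / tau)). The abstract reports only that the time courses were exponential, so this functional form and the helper name `adaptation_fraction` are assumptions made for illustration.

```python
import math

# Time constants reported in the abstract (seconds).
TAU_SIZE = 7.1
TAU_CONTRAST = 3.9


def adaptation_fraction(t, tau):
    """Fraction of the asymptotic aftereffect reached after t seconds.

    Assumes A(t) = A_max * (1 - exp(-t / tau)); this form is an
    illustrative assumption, not taken from the abstract.
    """
    return 1.0 - math.exp(-t / tau)


# Compare the two processes at the longest adaptation period used (20 s).
for label, tau in (("size", TAU_SIZE), ("contrast", TAU_CONTRAST)):
    print(label, round(adaptation_fraction(20.0, tau), 3))
```

Under this assumed form, a shorter time constant means the contrast aftereffect approaches its asymptote sooner than the size aftereffect for the same adaptation period.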

  • Baker, D.H. (2015). Surround suppression within and between the eyes does not increase with speed. Perception, 44(10): 1238.

Suppression is a ubiquitous property of sensory processing, occurring in the visual system between stimuli that are distinct in orientation, spatial frequency, eye of origin and spatial position. For overlaid cross-oriented masks, masking increases dramatically with stimulus speed (temporal frequency divided by spatial frequency) when mask and target are presented to the same eye, but is approximately speed-invariant for dichoptic presentation (Meese, T.S. & Baker, D.H., 2009). Such a striking difference implies distinct anatomical loci for these two forms of suppression. But is this specific to overlay masking or does it generalise to lateral inhibition from surround masks? Measuring detection thresholds for peripheral targets of different spatial (0.5, 1 & 4 c/deg) and temporal (4, 8 & 15Hz flicker) frequencies, surrounded by co-oriented annular masks shown to the target or non-target eye (using stereo shutter goggles), provides a clear answer to this question. Masking decreased slightly over five octaves of stimulus speed for both monocular and dichoptic presentation. This finding supports previous observations that monocular overlay masking is distinct from other types of contrast gain control, as it is the only pathway in which suppression increases with speed. Whereas suppression from dichoptic overlay, surround, and chromatic masks share a common spatiotemporal fingerprint and are presumably cortical in origin, monocular overlay masking may arise at an earlier stage of processing (e.g. Baker, D.H., Meese, T.S. & Summers, R.J., 2007).

  • Vilidaitė, G. & Baker, D.H. (2015). What is the best way to measure internal noise in the visual system? Perception, 44(10): 1242.

Visual neural noise is typically estimated using a contrast detection-in-noise paradigm, whereby white noise is added to a target stimulus. The noise contrast at which detection performance starts to decline is taken as equivalent to the system’s internal neural noise. But is this measure consistent with the level of noise estimated by other techniques? Alternative measures include: double pass consistency, where a detection-in-noise experiment is repeated twice using the same noise samples each time, and the consistency of responses is estimated; parameter estimates of internal noise from contrast discrimination experiments; estimates of response variance from steady-state EEG. In this study we aimed to validate these techniques by comparing them directly. Stimuli were patches of sine-wave grating with a spatial frequency of 0.5c/deg, flickering at 7 Hz. For the noise experiments, we introduced variance into the detecting channel by jittering the stimulus contrast interval-by-interval. We show that parameter estimates derived by fitting a transducer nonlinearity to contrast discrimination (‘dipper’) functions were able to accurately predict both noise masking thresholds and double pass consistency data for individual observers. The noise masking data alone were insufficient to constrain the nonlinearity, and fits of a linear model overestimated the observer’s consistency (and could not account for the contrast discrimination data). These results suggest that measuring dipper functions and double pass consistency may give the most valid measure of internal noise. This could be used to investigate clinical populations, such as autism spectrum disorder (ASD), in which abnormal levels of neural noise have been implicated.

  • Cunningham, D., Baker, D.H. & Peirce, J. (2015). Using the intermodulation term as a measure of selective responses to coherent plaids. Perception, 44(S): 227-228.

Mid-level neural mechanisms that combine signals encoding low-level visual features are still relatively poorly understood. Steady-state visual evoked potentials (SSVEPs) were recorded to measure the nonlinear combination of two sinusoidal gratings (1 cpd and 3 cpd in spatial frequency, respectively). The gratings were overlaid orthogonally, either with themselves or with each other, to form spatial frequency-matched (‘coherent’) or non-matched (‘incoherent’) plaids. While fundamental SSVEP responses directly represent the components of a presented stimulus, intermodulation responses represent their nonlinear combination at the point of or after summation (Spekreijse & Oosting, 1970; Spekreijse & Reits, 1982; Zemon & Ratliff, 1984). Grating components were simultaneously flickered at different frequencies (4.6 Hz, 7.5 Hz) resulting in fundamental component-based responses at these frequencies, as well as intermodulation responses at their difference (7.5 – 4.6 = 2.9 Hz) and sum (7.5 + 4.6 = 12.1 Hz). When the grating components formed an incoherent plaid, the sum intermodulation responses were small (if present) compared to when they formed a coherent plaid. This may represent differences in suppression from cross-orientation masking between the plaid conditions, or it may reflect selectivity for stimulus coherence. In support of the latter, the extent of fundamental response suppression that occurred for coherent and incoherent plaids was similar.
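The sum and difference frequencies quoted above are the lowest-order intermodulation terms, n*f1 + m*f2 with both n and m non-zero. A minimal sketch of that arithmetic follows; the function name `intermodulation_terms` and the `max_order` cut-off are hypothetical and not part of the study.

```python
def intermodulation_terms(f1, f2, max_order=2):
    """Return {(n, m): frequency} for positive frequencies n*f1 + m*f2.

    Only terms with both n and m non-zero are kept (pure harmonics are
    excluded), and |n| + |m| is limited to max_order.
    """
    terms = {}
    for n in range(-max_order, max_order + 1):
        for m in range(-max_order, max_order + 1):
            if n == 0 or m == 0:
                continue  # a harmonic of one input, not an intermodulation term
            if abs(n) + abs(m) > max_order:
                continue
            freq = round(n * f1 + m * f2, 6)  # rounding hides float noise
            if freq > 0:
                terms[(n, m)] = freq
    return terms


# Flicker frequencies from the abstract: 7.5 Hz and 4.6 Hz.
second_order = intermodulation_terms(7.5, 4.6)
print(second_order)
```

Raising `max_order` lists higher-order terms such as 2*f1 - f2, which the abstract does not discuss.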

  • Baker, D.H., Smith, A.K. & Wade, A.R. (2015). Increasing cortical GABA levels through dietary intervention. Perception, 44(S): 190.

The balance of neural inhibition and excitation is key to healthy brain function, but may be abnormal in several clinical conditions, including epilepsy (Porciatti, Bonanni, Fiorentini & Guerrini, 2000). The brain’s primary inhibitory neurotransmitter is gamma-aminobutyric acid (GABA). We investigated whether increasing availability of GABA precursors could affect neural responsivity, indexed by steady state visual evoked potentials (SSVEPs). Fourteen participants were shown sine wave gratings flickering at 7Hz at a range of contrast levels, with and without an overlaid orthogonal mask flickering at 5 Hz. They then consumed a 5 ml daily dose of yeast extract, high in B vitamins and glutamate (both GABA precursors), over a four week period, before being retested using the same stimuli. The dependent variables were the SSVEP responses at the target and mask frequencies. A significant interaction between target contrast and yeast extract (F(2.09,27.17) = 4.68, p < 0.02) demonstrated that the intervention reduced neural responses at higher contrast levels by up to 20%, but did not affect baseline activity to a blank screen. This was confirmed by a main effect of the intervention on responses to the mask (F(1,13) = 5.19, p < 0.05). We conclude that dietary intake can influence neural activity, suggesting a potentially valuable supplement to seizure medications.

  • Vilidaitė, G., Gutmanis, N., Harris, S., Kitching, R.E., Melton, H.A., Norman, R., Scott, C.J.T., Smith, A.K., Wailes-Newson, K. & Baker, D.H. (2015). Estimating variability in GABA-ergic suppression across the population. Perception, 44(S): 157-158.

Individual differences in response to visual contrast are typically observed in the amount of (i) saturation of responses at high contrast levels and (ii) suppression by cross-channel masking. A potential mechanism that underlies these processes is GABA-ergic inhibition, as GABA antagonists eliminate both effects in single neurons (e.g. Morrone, Burr & Speed, 1987). However, these two properties of contrast transduction have not been directly compared within the human population. We measured steady-state EEG responses to patches of sine-wave gratings (0.5 c/deg) flickering at 7 Hz in 100 subjects at seven contrast levels (0–64%). In some conditions a high contrast (32%) orthogonal mask flickering at 5 Hz was superimposed on the target stimuli. We calculated a saturation index, defined as the ratio of amplitudes at the two highest target contrast levels. To quantify masking, we took the ratio of amplitudes when the mask was present versus absent, averaged across intermediate contrast levels. We found a highly significant correlation between masking and saturation across observers (r = 0.50, p < .001), such that individuals who exhibited substantial masking also displayed strong saturation. This suggests a common underlying mechanism, most likely GABA-ergic inhibition, that varies widely in the population and therefore may have clinical relevance.

It has been proposed that distinct forms of neural suppression occur at different stages of visual processing. At a single neuron level, monocular and dichoptic cross-channel masking have very different properties consistent with pre-cortical and cortical neural substrates respectively (Li, Peterson, Thompson, Duong & Freeman, 2005, J Neurophysiol., 94: 1645-1650). Monocular cross-channel masking produces a lateral shift of the contrast response function (contrast gain) whereas dichoptic masking produces a multiplicative shift (response gain). We investigated how contrast response functions in human visual cortex are affected by four mask types: monocular and dichoptic cross-oriented (overlaid) masks, and aligned and orthogonal surround masks. We measured steady-state EEG responses at the occipital pole to 1 c/deg target stimuli (contrasts 6-96%) flickering at 5 Hz with or without drifting masks of various contrasts. We interpret the results in the context of a Naka-Rushton model (resp = rC^n/(S^n + C^n), where C is contrast) for which the response gain parameter (r), the saturation constant (S), or both parameters could be altered by the mask. All masks reduced target signal-to-noise ratios (SNRs) at high mask contrasts, with monocular overlay and aligned surround masks showing evidence of contrast gain shifts, and dichoptic and orthogonal surround masks consistent with changes in response gain. We also plotted target responses as a function of mask contrast, revealing a novel facilitatory effect of low contrast surround masks that can be explained within the normalization framework. At high mask contrasts there was clear saturation for surround and dichoptic masks above the noise floor, which was not apparent for monocular orthogonal masks. Saturating suppression is consistent with the mask signals having passed through a stage of nonlinear transduction before impacting the target signals.
These results bridge single cell studies in animals and psychophysical work in humans in delineating distinct suppressive pathways in the early visual system.
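The distinction between contrast gain and response gain in the Naka-Rushton model named above can be sketched numerically. The code uses the standard form with exponent n; the parameter values (r = 1, S = 20, n = 2) and the size of each mask effect are hypothetical, chosen only to show how the two kinds of change differ.

```python
def naka_rushton(c, r=1.0, s=20.0, n=2.0):
    """Naka-Rushton response: r * c**n / (s**n + c**n), with c as contrast (%)."""
    return r * c ** n / (s ** n + c ** n)


contrasts = [6, 12, 24, 48, 96]  # target contrasts (%) within the 6-96% range

# Hypothetical unmasked baseline.
baseline = [naka_rushton(c) for c in contrasts]

# Contrast gain change: the mask raises S, shifting the curve laterally.
contrast_gain = [naka_rushton(c, s=40.0) for c in contrasts]

# Response gain change: the mask lowers r, scaling the curve multiplicatively.
response_gain = [naka_rushton(c, r=0.5) for c in contrasts]

for c, b, cg, rg in zip(contrasts, baseline, contrast_gain, response_gain):
    print(f"{c:3d}%  baseline={b:.3f}  contrast-gain={cg:.3f}  response-gain={rg:.3f}")
```

With a contrast gain change the masked curve still approaches the same maximum at high contrast, whereas a response gain change lowers the maximum itself. That difference is what allows the two to be told apart in measured contrast response functions.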

Basic visual features such as contrast are processed in a highly nonlinear fashion, resulting in ‘dipper’ shaped functions in discrimination experiments. Previous work has applied a similar paradigm to investigate the representation of higher level properties such as facial identity (Dakin & Omigie, 2009, Vision Res, 49, 2285-2296). Here we ask whether emotional expressions are processed nonlinearly by measuring discrimination thresholds for six emotions (angry, happy, sad, fear, disgust, surprise) morphed along a continuum relative to neutral. Using a 2IFC paradigm, we estimated discrimination thresholds relative to six ‘pedestal’ morph levels between 0% and 75%. The participants’ (N=5) task was to indicate which of two faces (pedestal, or pedestal plus increment) conveyed the strongest expression. We found evidence of facilitation at low morph levels (~15%) and masking at higher levels (>60%), indicating the existence of a nonlinearity in the neural representation of expression, comparable to that reported for lower level visual features. We then asked whether facial features are integrated across the face before or after this nonlinearity by keeping the expression in one half of the face (top or bottom) fixed at neutral, and applying the pedestal and increment expressions to the other half. Sensitivity decreased by around a factor of two along the entire dipper function, relative to the whole-face condition, suggesting that facial expressions are integrated before nonlinear transduction. Finally we assessed the amount of interference between the two halves of the face, by fixing the expression in one half at a given level (the ‘mask’), and applying the target increment to the other half. This produced a strong masking effect, such that target expressions needed to approach the level of the mask to be detected. This is evidence for competition between the neural representations of different facial features.

  • Coggan, D.D., Liu, W., Baker, D.H. & Andrews, T.J. (2015). Category-selective patterns of neural response to scrambled images in the ventral visual pathway. Journal of Vision, 15(12): 622.

Neuroimaging studies have found reliable patterns of response to different object categories in the ventral visual pathway.  This has been interpreted as evidence for a categorical representation of objects in this region.  However, in addition to their semantic content, categories also differ in terms of their image properties.  The aim of this study was to determine if image properties could explain category-selective patterns of neural response in the ventral visual pathway.  We hypothesized that, if patterns of response in this region are tuned to low-level image properties, similar patterns of activity should also be evident for scrambled images that contain the same low-level properties, but are not perceived as objects.  To address this issue, we generated phase-scrambled versions of intact objects in two ways: 1) globally-scrambled – applied to the whole image; 2) locally-scrambled – dividing each image into an 8×8 grid and scrambling the contents of each window independently.  A behavioral study revealed that both scrambling processes rendered images unrecognizable.  We then used fMRI to measure patterns of ventral response to five object categories (bottles, chairs, faces, houses and shoes) with three image conditions (intact, globally-scrambled, locally-scrambled).  Using multivariate pattern analysis, we found distinct and reliable patterns for all five categories in intact and locally-scrambled image types. In contrast, the globally-scrambled images only showed reliable patterns for faces and houses.  In addition, we found that the similarity matrices for the intact and locally-scrambled images were significantly correlated (r=0.79, p< 0.001).  However, the similarity matrices from the intact and locally-scrambled images were not correlated with the globally-scrambled images.  These results suggest that similar patterns of response are elicited by intact and locally scrambled images.  
Taken together, these data suggest that category-selective patterns of response in the ventral visual pathway can be explained by image properties typical of different object categories.

Ocular dominance is an extensively studied form of neural plasticity. Several recent studies have demonstrated that a degree of eye dominance plasticity occurs in adults after one eye is patched for as little as 2.5 hours. Over these timescales, the patched eye, rather than the unpatched eye, becomes stronger in subsequent binocular viewing. However, little is known about the site and nature of the underlying processes. In this study, we examine the mechanisms underlying this eye dominance plasticity in adults by measuring steady-state visual evoked potentials (SSVEPs) as an index of the neural contrast response in early visual areas. The experiment consisted of three consecutive stages: a pre-patching EEG recording (14 minutes), a monocular patching stage (2.5 hours) and a post-patching EEG recording (14 minutes; started immediately after the removal of the patch). During the patching stage, a transparent patch (i.e. a diffuser, which transmits light but not pattern) was placed in front of one randomly selected eye. During the EEG recording stage, we measured contrast response functions for each eye to obtain an estimate of the contrast-dependence of the patching-induced changes. We found that patching one eye with a diffuser for 2.5 hours in adult humans increased the neural response to stimuli in the patched eye, whilst the responses from the unpatched eye remained the same. Such phenomena occurred under both monocular and dichoptic viewing conditions. We interpret this eye dominance plasticity in adult human visual cortex as homeostatic intrinsic plasticity regulated by an increase of contrast-gain in the patched eye.

To study binocular vision, it is necessary to use equipment capable of presenting distinct images to the left and right eyes. Techniques that have been developed to achieve this include mirror and prism stereoscopes, virtual reality goggles, and systems where the images are presented on a single display and separated using either active shutters, or passive filters tuned to different wavelengths or polarization angles of light. These latter methods are all subject to the phenomenon of crosstalk, where images intended for one eye are faintly visible to the other. We measured crosstalk for a variety of systems, including several active shutter methods with CRT, LCD and DLP displays, circularly polarized light imaged on three different screen materials viewed at various angles, and narrowband red/green anaglyph filters. We displayed squares of various luminances to one eye of the goggles, and measured their physical luminance through each eye. Our measure of crosstalk was the ratio of these luminances. We estimated crosstalk both when the nonstimulated eye was shown a black screen (Woods, Apfelbaum, & Peli, 2010; J Biomed Opt, 15(1):016011), and also when it was shown a mid‐grey screen to provide an estimate of contrast crosstalk. Our lowest crosstalk measures (<1%) were for a CRT monitor with fast decaying phosphor, and ferro‐electric shutter goggles. Surprisingly, some LCD systems produced negative contrast crosstalk, presumably due to polarization at the pixel level. Anaglyph glasses gave poor results, even with narrowband filters, in part because of an asymmetry in the luminance attenuation between filters.

The extent to which aftereffects transfer interocularly is thought to be informative about where adaptation occurs in the visual system. Features processed by early visual areas containing many monocular neurons will exhibit weak transfer relative to features processed by later, fully binocular areas. This logic has been applied to motion adaptation to make inferences about the level at which different forms of motion are processed. However, many previous studies have used bias‐prone methods and rarely compare different types of motion directly. We used a sophisticated speed‐matching paradigm designed to reduce bias (Morgan, 2013, J Vis, 13(8):26) to estimate interocular transfer (IOT) for five types of motion: simple translational motion, expansion/contraction, rotation, spiral and complex translational motion. We used both static and dynamic targets with subjects making binary judgements of perceived speed. The stimuli were disks of multiple Gabor elements. Test stimuli moved in the same or opposite direction to the adapters and were presented to the same or opposite eye. Overall, the average IOT was 65%, consistent with previous studies (mean over 19 studies of 67% transfer). There was a main effect of motion type, with translational motion producing stronger IOT (mean: 86%) overall than any of the more complex varieties of motion (mean: 51%). This is inconsistent with the notion that IOT should be strongest for complex motion processed in extrastriate regions that are fully binocular. It could be explained if local motion adaptation processes are dominant over more global processes at the level relevant for perception.

  • Meese, T., Baker, D. & Summers, R. (2014). Perception of global image contrast: integration and suppression of local contrasts, not MAX or RMS. Perception, 43(10): 1121-1122.

When we adjust the contrast knob on a TV set, we experience a perceptual change in global image contrast. Here we ask how that image statistic is computed. We used a contrast-matching task for checkerboard configurations of micro-patterns (known as Battenbergs), where A- and B-contrasts refer to the Michelson contrasts of 2D arrays of micro-patterns placed on the nominal ‘black’ and ‘white’ regions of the checkerboard, respectively. Mean luminance of the micro-patterns was the same as the mid-grey background. With this arrangement we could manipulate the A-contrasts (0–32%) independently of the B-contrasts, which were always set to 8% in the standard stimulus. Using a staircase procedure, the adjustable matching patterns had A- and B-contrasts that were either equal (full-match), or one of them was 0% (half-match). Stimuli were 20 × 20 arrays with check widths of either: 1, 2, 4, or 8 micro-patterns. At the extreme A-contrasts, perception depended on the highest contrast in the image. But for all four check-widths and both match types, there was a curious intermediate region where adding low A-contrast to B-contrast caused a paradoxical reduction in perceived global contrast. None of the following models predicted this: RMS, energy, linear sum, max, Legge & Foley. However, a gain control model incorporating wide-field integration and suppression of nonlinear contrast responses predicted the results with no free parameters. This model also accounts for summation of contrast at threshold, and challenging masking and summation effects in dipper functions. Thus, we conclude it represents a fundamental operation in human contrast vision.

  • Georgeson, M., Meese, T., Baker, D. & Wallis, S. (2014). Binocular summation: what is the fate of monocular signals? Perception, 43(10): 1117.

Neural responses evoked by similar features in the left and right eyes are combined in primary visual cortex, and contrast sensitivity for two eyes is better than one. This binocular summation is the neural basis for binocular fusion and stereo vision. Here we ask whether the monocular signals ever have separate access to perception, without passing through the binocular-summing pathway. Most experiments are agnostic on this question, and most models have focussed on the summing mechanism(s). Our psychophysical experiments presented the same fixed-contrast, horizontal, 1 c/deg pedestal grating to both eyes, and in a 2AFC task observers had to detect changes (increment or decrement) in the pedestal contrast of one or both eyes. When the change takes place only in one eye, the binocular mechanism should be equally good at sensing an increment or decrement, but at higher pedestal contrasts (> 5%) observers’ performance was much worse for decrements than increments. Conversely, with an increment in one eye and a decrement in the other eye, the binocular mechanism should perform poorly (because the summed changes tend to cancel each other), but observers’ thresholds were almost as good as for a one-eyed increment. These two critical results can be understood via three key principles: (i) signals from the two eyes are summed at low contrast but averaged at higher contrasts, (ii) monocular signals (L,R) are preserved in parallel with the combined one (B), and (iii) only the MAX over (L,R,B) is selected for further perceptual processing.

  • Meese, T.S., Baker, D.H. & Summers, R.J. (2014). Perception of global image contrast is predicted by the same spatial integration model of gain control as detection and discrimination. iPerception, 5(4): 245.

How are the image statistics of global image contrast computed? We answered this by using a contrast-matching task for checkerboard configurations of ‘battenberg’ micro-patterns where the contrasts and spatial spreads of interdigitated pairs of micro-patterns were adjusted independently. Test stimuli were 20×20 arrays with various sized cluster widths, matched to standard patterns of uniform contrast. When one of the test patterns contained a pattern with much higher contrast than the other, that determined global pattern contrast, as in a max() operation. Crucially, however, the full matching functions had a curious intermediate region where low contrast additions for one pattern to intermediate contrasts of the other, caused a paradoxical reduction in perceived global contrast. None of the following models predicted this: RMS, energy, linear sum, max, Legge & Foley. However, a gain control model incorporating wide-field integration and suppression of nonlinear contrast responses predicted the results with no free parameters. This model was derived from experiments on summation of contrast at threshold, and masking and summation effects in dipper functions. Those experiments were also inconsistent with the failed models above. Thus, we conclude that our contrast gain control model (Meese & Summers, 2007) describes a fundamental operation in human contrast vision.

How closely do computational models derived to understand psychophysical data relate to neural activity? Steady-state EEG provides a measure of the response of neural populations to visual inputs. Here, this technique is used to test the predictions of a general model of signal combination and suppression, recently proposed by Meese & Baker (2013, iPerception, 4: 1-16). Stimuli were 1c/deg gratings presented either to the left and right eyes (to test binocular combination) or interdigitated across space in micropatch ‘checks’ of four cycles (to test combination across space). The two components (left and right eyes, or adjacent spatial locations) flickered at either the same frequency (5Hz), or different frequencies (5 & ~7Hz) for a range of contrast combinations. With no free parameters, the model predicted several key findings in the complex pattern of contrast response functions that were observed empirically in both stimulus domains: (i) there is little increase in the response when a second component is added, especially at high contrasts, (ii) under specific conditions, increasing the contrast of one stimulus can reduce the overall response (analogous to Fechner’s paradox), (iii) when components flicker at different frequencies, a high contrast ‘mask’ component shifts the contrast response function to the right, (iv) when components increase in contrast together, the contrast response function is twice as steep for same frequency flicker as for different frequency flicker. The accuracy of these predictions is surprising, as the model was derived to explain data from psychophysical (contrast discrimination) experiments, with no expectation that it should generalise to other experimental paradigms. That it does so suggests that psychophysical methods are informative regarding the activity of large populations of neurons, and that the general combination model provides a good account of signal interactions across multiple dimensions.

Steady-state visual evoked potentials (SSVEPs) are contrast-dependent oscillations in the EEG signal induced by flickering visual stimuli. SSVEPs have been used to explore the mechanisms underlying binocular rivalry (Sutoyo & Srinivasan, 2009, Brain Res, 1251: 245-255), but to date they have not been used to compare activity between conscious and nonconscious vision during continuous flash suppression (CFS; Tsuchiya & Koch, Nat Neurosci, 8: 1096-1101). In CFS, high-contrast broadband masks are presented to one eye, causing stimuli in the other eye to be excluded from awareness. We investigated the effect of CFS on the SSVEP response to 1 c/deg gratings (contrasts of 4-64%) or face stimuli. Targets were presented at 9 Hz (sinusoidal on/off flicker) to one eye, with a Mondrian mask refreshing at 10 Hz in the other eye, for trials of 11 seconds. We recorded EEG signals at 64 electrode sites, and Fourier transformed the waveforms to estimate the amplitude of the neural response at each stimulus frequency. The CFS mask reduced SSVEP responses to the targets (both gratings and faces) across the majority of the scalp. However, targets were not typically suppressed from awareness for the entire trial. When categorised by the subjective reports of the participant, no differences in SSVEP amplitude were found between conscious and nonconscious viewing at occipital electrodes. This implies a fixed level of interocular suppression in early visual areas. We also observed no amplitude modulation at more frontal electrodes, but this is likely due to insufficient activity at these sites owing to our necessarily small (~4 deg) target stimuli. We conclude that since CFS produces high levels of non-dynamic suppression in early visual areas, brain regions that correlate strongly with perception must lie further up the cortical hierarchy. Locating these with SSVEPs is challenging given the requirement to use small target stimuli to ensure complete suppression.
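The step of Fourier transforming a waveform to read off the amplitude at each tagged frequency can be sketched with a single-frequency discrete Fourier sum. This is an illustration on synthetic data, not the authors' analysis pipeline: the 200 Hz sample rate, the component amplitudes and the helper name `amplitude_at` are all assumptions. Only the 9 Hz and 10 Hz frequencies and the 11 s trial length come from the abstract.

```python
import cmath
import math


def amplitude_at(signal, sample_rate, freq):
    """Amplitude of the sinusoidal component at `freq` (Hz) via a single-bin DFT."""
    n = len(signal)
    total = sum(
        x * cmath.exp(-2j * math.pi * freq * k / sample_rate)
        for k, x in enumerate(signal)
    )
    return 2 * abs(total) / n  # factor 2 folds in the negative-frequency half


SAMPLE_RATE = 200    # Hz, hypothetical
DURATION = 11        # s, trial length from the abstract
TARGET_HZ, MASK_HZ = 9, 10

# Synthetic "EEG": a 9 Hz target response plus a smaller 10 Hz mask response.
times = [k / SAMPLE_RATE for k in range(SAMPLE_RATE * DURATION)]
waveform = [
    1.5 * math.sin(2 * math.pi * TARGET_HZ * t)
    + 0.5 * math.sin(2 * math.pi * MASK_HZ * t)
    for t in times
]

target_amp = amplitude_at(waveform, SAMPLE_RATE, TARGET_HZ)
mask_amp = amplitude_at(waveform, SAMPLE_RATE, MASK_HZ)
print(round(target_amp, 3), round(mask_amp, 3))
```

An 11 s window holds a whole number of cycles at both 9 Hz and 10 Hz, so each component falls in its own frequency bin and the two responses can be separated cleanly.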

  • Brattan, V.C., Baker, D.H. & Tipper, S.P. (2014). Visual and motor priming effects on prediction of observed action in the first and third person perspectives. Journal of Vision, 14(10): 831.

Much evidence demonstrates the existence of a neural network of action-observation shared representations (Rizzolatti & Craighero, 2004). It has been posited that this action-observation network (AON) allows us to draw upon our own motor repertoire to facilitate prediction and interpretation of others’ actions (Wilson & Knoblich, 2005). If the network’s function is specifically for social understanding, then action observation should be more accurate in a third-person perspective (3PP) compared to a first-person perspective (1PP). In Experiment 1, participants viewed short action sequences of transitive actions: a hand reaching towards, grasping and removing an object from a table. The action was transiently occluded for 500 ms, after which the sequence continued with an offset of between -200 ms and 200 ms. Participants responded whether the continuation of the action began from a point that was earlier or later than expected. Fitting logistic functions to participants’ responses demonstrated no significant difference in the point of subjective equality (PSE) between the 1PP and 3PP observed actions. In Experiment 2, prior to the same computer task, the 3PP action was primed by participants observing the experimenter perform the transitive actions. This produced an overall improvement in accuracy, but again there was no difference in PSE between perspectives. In Experiment 3, participants performed the actions themselves before completing the same task. This produced a significant difference in PSEs for 1PP and 3PP observed actions (p = .01), with motor priming preferentially improving temporal prediction of actions observed in the 1PP. The findings suggest visual experience aids prediction of 3PP actions, whilst the predictive mechanisms of the AON draw on the motor repertoire of the observer to facilitate predictions of 1PP. The study thus questions the notion that motor experience aids prediction of others’ actions.

  • Chadnova, E., Reynaud, A., Clavagnier, S., Baker, D.H., Baillet, S. & Hess, R.F. (2014). Dynamics of dichoptic masking in the primary visual cortex. BMC Neuroscience, 15(Suppl 1): P145.

The inputs from the two eyes interact in a nonlinear fashion. This interaction can be either excitatory or inhibitory: Excitatory interaction (combination) occurs first in the primary visual cortex but little is known about the site of the inhibitory interaction (suppression). To investigate the latter, experimental paradigms typically present distinct inputs to the eyes (dichoptic presentation, one target and one mask are respectively presented to different eyes at the same time). Here we used magnetoencephalography (MEG) source imaging to establish the site of the cortical neural signature of interocular suppression in visual cortex. We presented different noise stimuli to each eye: The target-noise was presented for contrasts ranging between 0 and 32 %. The mask-contrast was presented to the other eye at fixed contrast (32%). We flickered the two noise stimuli (4 and 6 Hz) to elicit a frequency-tagged steady-state visual evoked response (SSVER) that was readily detectable in MEG traces. Four participants passively observed the visual presentation while keeping their gaze fixed on the center of the screen. The Brainstorm application was used to analyze the MEG data. MEG source time series were extracted from cortical regions of interest (ROIs) defined from the visual retinotopic maps of each participant obtained from previous fMRI acquisitions. Using the power of the cortical responses to the frequency-tagged stimuli, we constructed contrast response functions for all the ROIs (Figure 1A). To investigate dynamics of propagation of the response along visual cortical areas, the instantaneous phase of the signal was identified in each ROI. As expected, when the target was presented alone, the power of the responses was found to increase monotonically with contrast (Figure 1B, solid line). When a mask was added to the other eye, the contrast response was attenuated (Figure 1B, dotted line). 
Interestingly, the mask presented at a fixed contrast was also found to be gradually suppressed with increasing target contrast (Figure 1C). These effects were revealed for responses as early as the primary visual cortex. In the time-domain, we detected a progressive phase shift between the cortical responses along the ventral and dorsal streams. We characterized dichoptic suppression in the visual cortex with MEG. This suppression occurs as early as the primary visual cortex. The suppression between the inputs of varying contrast was also well defined in the MEG power signal. The temporal resolution of MEG cortical imaging enables the analysis of the phase shifts and delay of the steady-state visual-evoked response between cortical regions. When combined with individual visual cortical mapping, our method provides a temporally and spatially precise tool for the detailed elucidation of suppression in the visual processing induced by dichoptic masking.
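The frequency-tagging idea in this abstract can be sketched as follows: when two stimuli flicker at different rates, the response to each can be read out at its own frequency. This is a toy example on a synthetic signal, not the Brainstorm analysis used in the study. The sample rate, duration, amplitudes and the function name `amplitude_at` are assumptions.

```python
import cmath
import math

def amplitude_at(signal, sample_rate, freq):
    """Amplitude of the component of `signal` at `freq` (Hz), obtained by
    projecting the signal onto a complex exponential (a single DFT bin)."""
    n = len(signal)
    total = sum(x * cmath.exp(-2j * math.pi * freq * i / sample_rate)
                for i, x in enumerate(signal))
    return 2 * abs(total) / n

# Synthetic 'response' containing components at the two tag frequencies.
sample_rate = 200   # samples per second (assumed)
duration = 5        # seconds; holds a whole number of cycles at 4, 5 and 6 Hz
times = [i / sample_rate for i in range(sample_rate * duration)]
signal = [1.5 * math.sin(2 * math.pi * 4 * t) + 0.5 * math.sin(2 * math.pi * 6 * t)
          for t in times]

amp_4hz = amplitude_at(signal, sample_rate, 4)    # first tagged stimulus
amp_6hz = amplitude_at(signal, sample_rate, 6)    # second tagged stimulus
amp_5hz = amplitude_at(signal, sample_rate, 5)    # untagged frequency
print(amp_4hz, amp_6hz, amp_5hz)
```

Squaring these amplitudes gives power at each tag frequency. Repeating the measurement across target contrasts would give a contrast response function of the kind the abstract describes.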

The noise masking paradigm is widely used to assess visual deficits by measuring detection of targets embedded in broadband white noise. Recent work (Baker & Meese, 2012, Journal of Vision, 12(10):20, 1-12) demonstrates that unwanted suppression from such masks can contaminate estimates of internal variability. The magnitude of suppression can be assessed using a contrast matching paradigm, which measures the perceived contrast of a grating embedded in noise. For both dynamic and counterphase flickering noise at a range of temporal frequencies (1-19Hz), perceived contrast was reduced most severely (a factor of >4) at higher temporal frequencies. This is consistent with threshold elevation results for orthogonal grating masks (Meese & Holmes, 2007, Proc R Soc B, 274: 127-136). A second line of evidence comes from steady state visual evoked potential (SSVEP) measurements of the contrast response function to sine-wave gratings (1c/deg, 5Hz flicker) at the occipital pole (Oz). There was a marked reduction in the grating response when a high contrast noise mask was added at a temporal frequency (7Hz) that is distinct in the Fourier spectrum of the EEG. The implications of gain control suppression, as well as suggestions for how best to estimate internal noise, will be discussed.

Binocular rivalry describes the perceptual alternations that occur over time when dissimilar images are presented to the two eyes. Many factors affect dominance in rivalry, including low-level image attributes such as contrast or broadband spatiotemporal structure. Rivalry dominance can also be influenced by factors external to the stimuli, such as attention and spatial context. Recent work has found that unambiguous primes can facilitate or suppress a congruent rivalry image, depending on the prime strength, e.g. amount of luminance contrast (Pearson and Brascamp, 2008). The present study asks whether similar facilitatory and suppressive effects can be seen for crossmodal primes. Each block of trials consisted of 120 Prime-Blank-Rivalry sequences of 3s each. Rivalry stimuli were oriented Gabor patches (±45° from vertical), one to each eye. The prime stimulus matched one of the two rivalry patches. Primes were presented either visually (same image to both eyes), haptically (using a Phantom force feedback device), or in both visual and haptic domains simultaneously. Additionally, prime strength was manipulated by varying contrast (visual primes) or ridge height (haptic primes). The effect of the prime on onset dominance was modulated by prime strength (higher luminance / ridge height values). At low strengths, the prime stimulus facilitated the perception of the congruent rivalry stimulus. Conversely, at higher strengths, the prime suppressed the congruent rivalry stimulus. In addition, the visual/haptic prime data showed evidence of cross-modal additivity: both the visual and haptic components contributed to increase the effective strength of the prime.

  • Baker, D.H. & Meese, T.S. (2013). A template model predicts detection of sparse stimuli. Perception, 42: 363.

Contemporary models of contrast integration across space assume that pooling operates uniformly over the target region. For sparse stimuli, where high contrast regions are separated by areas containing no signal, this strategy may be sub-optimal because it pools more noise than signal as area increases. Little is known about the behaviour of human observers for detecting such stimuli. We performed an experiment in which three observers detected regular textures of various areas, and six levels of sparseness. Stimuli were regular grids of horizontal grating micropatches, each 1 cycle wide. We varied the ratio of signals (marks) to gaps (spaces), with mark:space ratios ranging from 1:0 (a dense texture with no spaces) to 1:24. To compensate for the decline in sensitivity with increasing distance from fixation, we adjusted the stimulus contrast as a function of eccentricity based on previous measurements (Baldwin, Meese & Baker, 2012, J Vis, 12(11):23). We used the resulting area summation functions and psychometric slopes to test several filter-based models of signal combination. A MAX model failed to predict the thresholds, but did a good job on the slopes. Blanket summation of stimulus energy improved the threshold fit, but did not predict an observed slope increase with mark:space ratio. Our best model used a template matched to the sparseness of the stimulus, and pooled the squared contrast signal over space. Templates for regular patterns have also recently been proposed to explain the regular appearance of slightly irregular textures (Morgan et al., 2012, Proc R Soc B, 279: 2754-2760).

  • Baker, D.H. & Meese, T.S. (2012). Using psychophysical reverse correlation to measure the extent of spatial pooling of luminance contrast. Perception, 41: 1512.

Area summation experiments measure the improvement in detectability of contrast as stimuli get larger. But what are the limits of this improvement, and what strategy does an observer use to integrate contrast across space? We employed a reverse correlation technique to directly estimate the size of the pooling region, and to compare two different strategies: signal selection (MAXing) and signal combination (summing). Stimuli were regular arrays of (27×27) grating patches, the contrasts of which were determined individually from a normal distribution (mean 32%, SD 10%) on each trial interval. Observers detected a contrast increment applied to a square subset of the patches (1 to 27 elements wide). Each observer completed 2000 2IFC trials per target size using a blocked method of constant stimuli design. We correlated the observer’s trial-by-trial responses with the contrast difference of each patch across trial intervals. This produced a map, akin to a classification image, that revealed the patches contributing to the observer’s decisions. But this standard approach cannot distinguish between summing and MAXing strategies. We therefore directly compared (again using correlation) trial-by-trial model predictions of observer responses for both strategies across a range of pooling windows. Summing target contrasts over space produced the strongest correlation with human behaviour, and provided an estimated pooling region of 9-13 grating cycles. This supports earlier work that had reached similar conclusions using more traditional techniques. Furthermore, individual differences in maximum pooling region correctly predicted the rank ordering across observers of the magnitude of area summation at detection threshold.
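The correlation step described above can be sketched with a simulated observer. This is a simplified illustration, not the authors' code. It uses a 9×9 grid rather than 27×27 to keep it fast, omits the target increment, and assumes a hypothetical 'summing' observer with a 3×3 pooling window and additive decision noise. Only the contrast distribution (mean 32%, SD 10%) comes from the abstract.

```python
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n_trials=1500, grid=9, pool=3, seed=1):
    """Return a map of response/contrast-difference correlations per patch."""
    rng = random.Random(seed)
    lo = (grid - pool) // 2
    pooled = [(r, c) for r in range(lo, lo + pool) for c in range(lo, lo + pool)]
    diffs = {(r, c): [] for r in range(grid) for c in range(grid)}
    responses = []
    for _ in range(n_trials):
        trial = {}
        for key in diffs:
            # Contrast difference (interval 2 minus interval 1) for this patch.
            d = rng.gauss(32, 10) - rng.gauss(32, 10)
            trial[key] = d
            diffs[key].append(d)
        # Summing observer: chooses interval 2 when the pooled contrast
        # difference plus internal noise (assumed SD of 10) is positive.
        decision = sum(trial[k] for k in pooled) + rng.gauss(0, 10)
        responses.append(1.0 if decision > 0 else 0.0)
    cmap = {k: pearson(v, responses) for k, v in diffs.items()}
    return cmap, pooled

cmap, pooled = simulate()
inside = [cmap[k] for k in pooled]
outside = [v for k, v in cmap.items() if k not in pooled]
mean_inside = sum(inside) / len(inside)
mean_outside = sum(outside) / len(outside)
print(round(mean_inside, 3), round(mean_outside, 3))
```

Patches inside the simulated pooling window correlate with the responses and patches outside do not, which is what makes the map resemble a classification image. As the abstract notes, a map like this does not by itself separate summing from MAXing. That comparison needs trial-by-trial model predictions.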

  • Meese, T.S., Baker, D.H., Summers, R.J. & Georgeson, M.A. (2012). Contrast integration and counter suppression: a general scheme for visual hierarchies? Journal of Vision, 12(14): 33.

Low contrast stimuli are much easier to detect with two eyes than with one, suggesting a process of signal summation. But in everyday life, when we close one eye, the contrast of the world does not diminish, but remains fairly constant, implying that different or additional processes are involved above threshold. We have developed a generic gain control model of contrast summation that involves contrast integration along the dimension of interest and slightly less potent suppression along the same dimension: the model giveth with one hand and taketh away with the other. This counterintuitive behavior allows the model to benefit from pooling at threshold, yet maintain a contrast code that is invariant with the extent of pooling above threshold. It also suggests that it might be possible to reveal the operation of the integration process above threshold with appropriate experimental manipulations. By measuring various forms of ‘dipper functions’ we have been able to confirm this in the domains of ocularity, space, orientation and time. The model also predicts paradoxical psychometric functions (‘swan’ functions) that we find for appropriate arrangements of target and pedestal in each of the same four dimensions. Furthermore, we show that the general arrangement that we propose is a suitable basis for building visual hierarchies and population codes for global measures along various dimensions of interest. This idea receives some direct support from novel experiments in which we reveal aftereffects for global size adaptation.

Numerous algorithms exist for removing unwanted noise from digital images. Typically these are inspired by biological visual systems and involve bandpass filtering and thresholding to remove the noise. However, the efficacy of such algorithms is usually assessed numerically (e.g. by calculating the RMS error between original and denoised images) with little regard to perceptual consequences for the end user. This can – in principle – lead to situations where two algorithms are equally ‘good’ numerically, yet one may produce highly salient artefacts whilst the other does not. We propose a novel behavioural method for comparing denoised images. Using segments of 308 images from a public image library, we measured contrast thresholds for 3 observers detecting Gaussian pixel noise added to the image in one interval of a 2IFC experiment; these constitute our baseline. We then repeated the experiment, having passed all of the stimuli (images with or without noise added) through a denoising algorithm (Fischer et al., 2007, Int J Comp Vis, 75(2): 231-246). The increase in threshold relative to the baseline (typically a factor of ~2) provides an index of the success of the denoising algorithm by giving an indication of the amount of perceptually meaningful (e.g. visible) noise that is ‘hidden’ by the algorithm. We use this technique to compare polar- and cartesian-separable log-Gabor filters, as well as filters of different orientation bandwidths. Thresholds occurred at a constant peak-signal-to-noise ratio for baseline and denoised conditions, linking numerical comparisons with a measure of perceptual validity for the end user.

  • Baker, D.H. & Meese, T.S. (2012). Size adaptation effects are independent of spatial frequency aftereffects. Perception, 41(S): 33.

Repulsive adaptation effects for spatial frequency, motion and orientation (the tilt aftereffect) are well established, and support the notion of population coding in each of these domains. We have recently proposed (Meese and Baker, 2011 Journal of Vision 11(1) 23, 1–23) that the spatial extent of an object or texture is represented in a similar way. If so, adaptation effects should exist that are sensitive to object size (e.g. diameter) rather than the scale of a texture (spatial frequency). Using a matching task, we measured perceived size of 4 cycles deg⁻¹ gratings before and after exposure to an adaptor that jittered in space to cover the area of the largest target. All eight of our naïve observers experienced a clear shift in perceived size: large targets looked larger and small targets looked smaller. The effect is similar in magnitude (10–20%) to spatial frequency repulsion effects (which we also replicate) but does not induce them: increasing perceived area does not increase perceived bar width. Size adaptation is robust to the relative orientation of adaptor and target, and even occurs for disparate objects such as gratings and faces. This implies adaptation of a broadly-tuned process which estimates the envelope of a stimulus.

Increasing the area of a luminance-modulated sine wave grating decreases its contrast detection threshold. The process by which individual samples from discrete locations in the visual field are combined to achieve this is investigated here by analytic modeling. Several combinations and orders of transduction, template, and summation type were considered. Predictions from these models were compared to spatial summation results measured for two different stimulus types. The first was a set of circular sine-wave gratings (4 c deg⁻¹) of various diameters, including a subset of “Swiss cheese” gratings that were modulated by a raised plaid to halve their total contrast over area (Meese and Summers, 2007 Proceedings of the Royal Society B 274 2891–2900). The second set of stimuli were rectangular grating patches presented both in the fovea and in the periphery, replicating Robson and Graham (1981 Vision Research 21 409–18). In other conditions, these stimuli were multiplied by an attenuation surface that compensated for the confounding loss of contrast sensitivity with retinal eccentricity. Our analyses reveal that the full wealth of our results can be described by a single model. This involves spatial filtering, square-law transduction and linear summation of signal and internal noise within a template matched to stimulus extent.

  • Summers, R.J., Baker, D.H. & Meese, T.S. (2012). Spatial integration within and between first- and second-order stimuli. Perception, 41(S): 223.

The detection of first-order (luminance-modulated, LM) and second-order (contrast-modulated, CM) stimuli is believed to involve separate mechanisms that interact weakly or are entirely independent; detection of an LM-only stimulus is barely improved by the addition of CM. However, little is known about the integration of stimuli comprising non-overlapping regions of LM and CM. Spatial summation of LM, CM and LM+CM targets was assessed using (i) full 1.25 c deg⁻¹ gratings of different sizes (1–16 cycles), and (ii) fixed-diameter targets whose signal area was controlled by modulating a large (8 or 16 cycles) ‘full’ grating with a raised plaid pattern. The noise carrier (also present for LM stimuli) was bandpass-filtered white noise (8 c deg⁻¹, ±0.5 octaves). We find that sensitivity improves with target size more rapidly for LM than for CM. When area was constant, comparing full and modulated stimuli yielded summation of ∼5dB for both CM and LM. We also investigated cross-order summation, which was weak (∼2dB) for full CM+LM (threshold adjusted) stimuli, but stronger (∼3dB) when first and second order stimuli were interdigitated over area. This suggests a mechanism capable of integrating textures with attributes that vary over space, perhaps owing to changes in illumination or material properties.
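For readers less used to decibel units, the summation figures above can be converted to threshold ratios. This sketch assumes the convention usual in contrast psychophysics, dB = 20·log10 of a contrast ratio. The abstract does not state its convention, and the function names are illustrative.

```python
import math

def db_to_ratio(db):
    """Contrast (threshold) ratio corresponding to a value in dB,
    assuming dB = 20 * log10(ratio)."""
    return 10 ** (db / 20)

def ratio_to_db(ratio):
    """Inverse conversion: contrast ratio to dB."""
    return 20 * math.log10(ratio)

for db in (2, 3, 5):
    print(db, "dB ->", round(db_to_ratio(db), 3))
```

Under this convention, 5dB corresponds to a sensitivity ratio of about 1.78, 3dB to about 1.41 and 2dB to about 1.26.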

Masking from 2D white noise is typically attributed to an increase in variance in the detecting mechanism. Direct evidence for this comes from the double pass technique, in which noise masks increase the consistency of observers’ responses (albeit by less than expected). But noise masks must also activate non-target mechanisms, and these will have a suppressive effect on the detecting mechanism similar to that of cross-oriented grating masks. A consequence of this gain control suppression is that masks should attenuate the perceived contrast of targets. We demonstrate this using a 2IFC matching paradigm for 1c/deg horizontal log-Gabor targets embedded in either white noise or an oblique mask of three times the target frequency (3F mask). What was previously unknown is how the two contributions to masking – variance and suppression – interact with each other. We assess this here by jittering the contrast of a zero-mean pedestal on a trial-by-trial basis (e.g. Cohn, 1976, J Opt Soc Am, 66: 1426-1428), producing a noise stimulus that is entirely within-mechanism. We measured masking functions using a 2IFC procedure for this jitter mask with and without cross-orientation suppression from a high-contrast 3F mask. Intuitively, the effects of these different masks might be expected to sum. However, the standard gain control model predicts that when one source is more potent than the other, it will dominate, accounting for all of the masking. At low jitter variances, the 3F mask raised thresholds fourfold for all three observers. At higher jitter variances the masking functions converged, as predicted by the model. However, since masking by suppression and masking by variance produce identical forms of masking function, it is not possible to use (noise) masking functions to assess the equivalent internal noise unless the relative contributions of each source of masking are known. This might be difficult to achieve.

Binocular vision is traditionally treated as two processes: the fusion of similar images, and the interocular suppression of dissimilar images (e.g. binocular rivalry). Recent work (Baker & Meese, 2007, Vision Res, 47:3096-3107) has demonstrated that interocular suppression is phase-insensitive, whereas summation occurs only when stimuli are in phase. Here we examine how interocular phase difference affects perceived contrast using a matching paradigm. All stimuli were horizontal 1c/deg sine-wave gratings, presented for 200ms using shutter goggles to enable dichoptic presentation. Perceived contrast was measured for a wide range of phase offsets (0-180°) and matching contrasts (2-32%). Our results reveal a complex interaction between contrast and interocular phase. At low contrasts, perceived contrast reduced monotonically with phase offset, by up to a factor of 1.6. At higher contrasts the pattern was non-monotonic: perceived contrast was veridical for in-phase and antiphase conditions, and monocular presentation, but increased at intermediate phase angles. This finding challenges a recent model (Huang et al., 2010, PLoS ONE, 5:e15075), in which contrast perception is phase-invariant. The results were used to extend binocular contrast gain control models to include phase. The simplest model involves monocular gain control and linear binocular summation within separate phase-specific channels, followed by a MAX selection across them. Importantly, this model has only a single (zero) disparity channel and embodies both fusion and suppression processes within a single framework where the winning operation depends on interocular phase. The algorithms underlying binocular combination in humans might also inform models of image fusion in applied settings.

  • Wallis, S.A., Baker, D.H., Meese, T.S. & Georgeson, M.A. (2012). The psychometric function in spatiotemporal contrast vision: slope constancy, stimulus generality and non-stationarity. Perception, 41(3): 376-377.

The slope of the two-interval, forced-choice psychometric function (detection performance vs image contrast) provides valuable information about the relation between contrast sensitivity and signal strength. However, little is known about how, or whether, the slope varies with stimulus parameters, such as spatiotemporal frequency or stimulus size and shape. For example, magnocellular and parvocellular pathways are thought to dominate opposite corners of spatiotemporal frequency space and because they have different contrast response characteristics (magno cells are less linear), they might produce different psychometric slopes. A second unresolved issue concerns how the psychometric slope should be measured. Non-stationarity of the observer (where threshold drifts between experimental sessions) can produce underestimation of the true slope if curve fitting is done after collapsing the data across all experimental trials. We addressed these issues by repeatedly measuring psychometric functions for 2 experienced observers, with 14 different spatiotemporal configurations of grating patches and bars, over 8 days. Psychometric slope was fairly constant across conditions, consistent with a common form of nonlinear contrast transducer and/or a common level of intrinsic stimulus uncertainty. Our analysis revealed only very small levels of non-stationarity, indicating that, in practice, there is little difference between averaging multiple slopes estimated from several experimental sessions, and estimating a single slope from data averaged over several experimental sessions. Computational models for contrast detection may thus be simplified by assuming both a common psychometric slope for different stimuli, and stationarity of threshold and slope over time.

  • Baldwin, A.S., Meese, T.S. & Baker, D.H. (2012). Extensive physiological summation of contrast signals over area revealed by Witch’s Hat compensation for retinal inhomogeneity. Perception, 41(3): 366-367.

The luminance profile of the retinal image is sampled by retinal circuitry. These pointwise representations must then be combined through neuronal convergence to represent spatially extensive patterns, surfaces and objects in the brain. The mechanisms by which this takes place can be investigated psychophysically by studying the effects of manipulating the shape of a stimulus on its contrast detection threshold. Investigations into area summation are complicated by the inhomogeneity of contrast sensitivity across the retina. Previous studies have attempted to mitigate this by presenting the stimuli in the periphery where sensitivity is relatively constant (Robson & Graham, 1981, Vision Research, 21, 409-18), however this cannot be expected to provide an accurate picture of what happens in the central visual field. We have used detection threshold data to map the contrast sensitivity of the visual field; the reciprocal of the map is an attenuation surface that normalises sensitivity across the visual field when multiplied with the stimulus. We did this for 4 c/deg sine-wave gratings and ‘Swiss cheese’ stimuli where a carrier grating is modulated by a raised plaid thereby halving its contrast summed over area. Preliminary results show more spatially extensive fourth-root summation than has previously been observed in the central visual field, extending up to 26 cycles of the carrier grating.

Tilt-shift lens technology can produce photographs of distant objects with very narrow depths of field. For scenes with appropriately placed foreground and background, fake tilt shift (FTS) effects can be achieved by applying blur gradients to the upper and lower parts of a conventional photograph. Either way, the treatment causes real scenes to look like small-scale models. This happens because the blur produces a shallow depth of field, which makes the focused object appear close, which means it must also be small to be within view. Previous attempts to study this FTS miniaturization used subjective measures of perceived distance, but these are complicated by the observer’s choice of cognitive strategy or interpretation of instructions. We improved on this method here by devising a 2AFC performance task where participants viewed pairs of achromatic railway scenes for 5 seconds. One scene was always real and the other was always a detailed 1:76 scale model (see Figure 1) and observers were informed of this. Their task was to decide which of the two was the real full-scale scene. There were six treatments of the real photographs: null, total blur, FTS blur, inverse FTS blur (i.e. blurred across the middle and sharp at the top and bottom), orthogonally oriented FTS blur (i.e. the blur gradient was orthogonal to the ground plane) and FTS blur with no gradient (i.e. a strip of focus through a blurred image). Each of 6 real photographs was given each of the treatments and compared to each of 6 model photographs. Each of 108 participants performed 6 trials in a random order. For the null treatment, observers detected reality reliably, whereas for FTS blur with and without gradients, the model world was mistaken for reality (i.e. percent correct was significantly less than 50%, corresponding with negative d-prime). Participants were around chance for the other treatments. 
We conclude that the most important factor for achieving FTS miniaturization is the correct alignment of the treatment with the subject/ground plane, not the inclusion of a blur gradient.

Threshold elevation following monocular adaptation is weaker in the unadapted eye than in the adapted eye. At least 15 studies have measured this interocular transfer (IOT) phenomenon, and typically report around 60% transfer. Yet almost all of these studies used spatial frequencies above 3c/deg, very slow temporal parameters, and criterion sensitive methods (method of adjustment, yes/no). In recent work, we (Meese, T.S. & Baker, D.H., 2011, iPerception, 2: 159-182) found markedly weaker interocular transfer at low spatial and high temporal frequencies. Here, we measure IOT in 9 observers for a broad range of spatiotemporal frequencies (0.5, 2 & 8c/deg; 1, 4 & 15Hz) using a 2AFC paradigm. Targets were horizontal Gabor patches with a full-width-at-half height of 1.67 (lower frequencies) or 6.68 (8c/deg) grating cycles. Adaptors were larger gratings with the same spatiotemporal properties as the targets. Observers adapted for two minutes initially, and 5 seconds between each trial, with monocular presentation enabled by shutter goggles. We typically found weaker IOT than previously reported (<50%), particularly for our fastest stimuli (lowest spatial and highest temporal frequencies) where it was virtually absent in all cases. Binocular summation and monocular adaptation were normal in all conditions. This implies that adaptation to ‘magno’ stimuli, not investigated in previous studies, occurs at a monocular locus. We also consider possible methodological confounds in classical studies which might have inflated the levels of IOT. These include the formation of retinal afterimages from static adaptors, and changes in criterion unrelated to changes in sensitivity.

  • Meese, T.S., Baker, D.H. & Summers, R.J. (2012). Fake tilt shift miniaturisation causes negative d-prime for detecting reality. iPerception, 3(4): 223.

Tilt-shift lens technology can produce photographs of distant objects with very narrow depths of field. For scenes with appropriately placed foreground and background, fake tilt shift (FTS) effects can be achieved by applying blur gradients to the upper and lower parts of a conventional photograph. Either way, the treatment causes real scenes to look like small-scale models. This happens because the blur depth-cue implies that the object is close, and therefore small to fit within the field of view. Previous attempts to study this used subjective measures of perceived distance, which are complicated by cognitive strategy. We improved on this by devising a 2AFC performance task where participants viewed pairs of achromatic railway scenes for 5 seconds. The target scene was real, the other was a detailed 1:76 scale model. There were six treatments of the real photographs: null, total blur, FTS blur, inverse FTS blur, orthogonally oriented FTS blur, FTS blur with no gradient (i.e. a strip of focus through a blurred image). Each of 6 real photographs was given each of the treatments and compared to each of 6 model photographs. Each of 36 participants performed 6 trials in random order. For the null treatment, observers detected reality reliably, whereas for FTS blur with and without gradients the model world was mistaken for reality (negative d-prime). Participants were at chance for the other treatments. We conclude that the key factor for achieving FTS miniaturization is the correct alignment of the treatment with the subject, not the inclusion of a blur gradient.

  • Baker, D.H., Meese, T.S., Georgeson, M.A. & Hess, R.F. (2011). How much of noise masking derives from noise? Perception, 40(S): 51.

In masking studies, external luminance noise is often used to estimate an observer’s level of internal (neural) noise. However, the standard noise model fails three important empirical tests: noise does not fully linearise the slope of the psychometric function, masking occurs even when the noise is identical in both 2AFC intervals, and double pass consistency is too low. This implies the involvement of additional processes such as suppression from contrast gain control or increased uncertainty, either of which invalidate estimates of equivalent internal noise. We propose that jittering the target contrast (c.f. Cohn, 1976, J Opt Soc Am, 66:1426-1428) provides a ‘cleaner’ source of noise because it excites only the detecting mechanism. We compare the jitter condition to masking from 1D and 2D (white and pink) pixel noise, pedestals and orthogonal masks, in double pass masking and (novel) contrast matching experiments. The results show that contrast jitter produced the strongest masking, greatest double pass consistency, and no suppression of perceived contrast: just as the standard model of noise masking predicts (and unlike pixel noise). We attribute the remainder of the masking from pixel noise to contrast gain control, raising concerns about its use in equivalent noise masking experiments.

  • Baker, D.H. & Meese, T.S. (2011). Varying extrinsic uncertainty affects the slope and position of the psychometric function for contrast detection and contrast discrimination. Perception, 40: 113.

The slope of the psychometric function for contrast detection is controlled by nonlinear contrast transduction or uncertainty, or a combination of the two. For contrast discrimination, the pedestal removes intrinsic uncertainty, and contrast gain control reduces the effective exponent; both processes result in a shallower slope of the psychometric function. Manipulating extrinsic uncertainty experimentally should affect both threshold and slope but, despite its theoretical importance, this test has not been performed previously at both detection threshold and above. Here we manipulated spatial uncertainty for detection and discrimination of a pair of horizontal 4 cycles deg⁻¹ Gabor patches placed equidistant from a central fixation point on the circumference of a virtual circle. In a temporal 2AFC paradigm, there were 1, 2, 4, or 8 possible locations for the target pairs, indicated by low contrast rings. The level of uncertainty was fixed within a block of trials, with target contrast levels determined by the method of constant stimuli. For contrast discrimination, the experiment was identical except that pedestals were presented in all locations on every trial. Thresholds and slopes increased with extrinsic uncertainty for both detection and discrimination. However, the threshold effect was greater for discrimination than for detection, confirming our prediction that intrinsic uncertainty is greater at threshold than above. We report estimated levels of intrinsic uncertainty for a range of transducer exponents (1: 3). A detailed understanding of the effects of intrinsic and extrinsic uncertainty is critical for examining effects such as collinear facilitation, for which uncertainty reduction is a common explanation.

  • Baldwin, A.S., Meese, T.S. & Baker, D.H. (2011). Retinal inhomogeneity and the witch’s hat: contrast sensitivity declines as a bilinear function of eccentricity in each direction. Perception, 40: 112.

The logarithm of contrast sensitivity has been described as a linear function of retinal eccentricity for a visual field of 120 deg (Pointer and Hess, 1989 Vision Research 29 1133-1151). Here we ask whether this is a suitable account for the central 9 deg of the visual field where most contrast sensitivity experiments are performed. We measured contrast detection thresholds for oriented cosine-phase log-Gabor stimuli with a spatial frequency of 4 cycles/deg and bandwidths of 1.6 octaves and 25°. Four meridians were tested (-45°, 0°, 45° and 90°), each with four stimulus orientations (-45°, 0°, 45° and 90°). Eccentricity was sampled in steps of 6 cycles, and 1.5 cycles in a subsample of conditions. In almost every case, we found that the initial sensitivity loss with eccentricity was steep (average = 1.1 dB/cycle), becoming shallower (average = 0.4 dB/cycle, similar to previous reports) after a critical point: a behaviour that was nicely described by a bi-linear equation. This equation also improved the fit to the Pointer and Hess results. Sensitivity to the entire central visual field was estimated by elliptical interpolation between bi-linear fits to each of the four cardinal half-meridians. This produced a sensitivity surface shaped like a “witch’s hat”, and made good predictions for the results for the oblique meridians. By testing other spatial frequencies, we aim to determine whether the location of the hat’s brim is a fixed visual angle (as might be expected on anatomical grounds) or a fixed number of stimulus cycles.
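The bi-linear description in the abstract above can be illustrated with a minimal sketch. The two slopes (1.1 and 0.4 dB/cycle) are the averages quoted in the abstract; the critical point of 3 cycles and the function name are hypothetical placeholders, since the abstract does not report where the slope changes.

```python
def sensitivity_loss_db(ecc_cycles, knee_cycles=3.0, steep=1.1, shallow=0.4):
    """Loss of log contrast sensitivity (dB) at an eccentricity given in stimulus cycles.

    steep and shallow are the average slopes quoted in the abstract (dB/cycle).
    knee_cycles is a HYPOTHETICAL critical point chosen only for illustration.
    """
    ecc = abs(ecc_cycles)  # the loss is assumed symmetric about fixation
    if ecc <= knee_cycles:
        return steep * ecc
    # beyond the critical point the loss continues at the shallower rate
    return steep * knee_cycles + shallow * (ecc - knee_cycles)


for ecc in (0, 1.5, 3, 6, 9):
    print(ecc, round(sensitivity_loss_db(ecc), 2))
```

Plotting this function against eccentricity in every direction gives the steep central peak and shallow brim of the "witch's hat" shape.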

  • Baker, D.H. & Meese, T.S. (2010). ‘Dilution masking’, negative d-prime and nonmonotonic psychometric functions for eyes, space and time. Perception, 39(S): 7.

Recent work investigated contrast-interactions using targets and pedestals constructed from stimulus pairs (A, B) that were interdigitated across the domain of interest. It suggested similar gain-control frameworks for summation and suppression of contrast in the domains of space, time, ocularity and orientation. One of the properties of this framework is ‘dilution masking’. This is different from each of the well-known processes of ‘within-channel’ and ‘cross-channel’ masking and it derives from the integration of relevant target and pedestal regions (A) with uninformative pedestal regions (B). It makes a surprising prediction when the pedestal contrast in the target region (A) is reduced to 0%. If suppression between A and B is strong, the model’s contrast-response first decreases for small target contrast-increments before increasing to threshold as target contrast approaches the high mask contrast. This predicts a non-monotonic psychometric function, where discrimination performance drops below 50% correct in 2AFC, implying negative d’. We have confirmed the existence of this paradoxical ‘trough’ empirically in contrast-masking experiments for interdigitated (A, B) stimuli in each of three stimulus domains: space, time (flicker) and ocularity. But in the orientation domain, where cross-orientation suppression is relatively weak, the effect is not found, consistent with the gain-control model.

  • Meese, T.S. & Baker, D.H. (2010). Summation and suppression of luminance contrast across eyes, space, time and orientation: the equation of the visual brain. Perception, 39(S): 6.

To understand summation of visual signals we also need to understand the opposite process: suppression. To investigate both processes we measured triplets of dipper functions for targets and pedestals involving interdigitated stimulus pairs (A, B) within each domain of interest (task: detect A on A; A on A+B; A+B on A+B). Previous work showed that summation and suppression operate over the full contrast range for the domains of ocularity (A and B are left and right eyes) and space (A and B are adjacent neighbourhoods). Here we include orientation and time domains and a contrast-matching task. Temporal stimuli were 15Hz counter-phase sine-wave gratings, where A, B were the positive and negative phases of the oscillation. For orientation, we used orthogonally oriented contrast patches (A, B) whose sum was an isotropic difference-of-Gaussians. Results from all four domains could be understood within a common framework: summation operates independently within the excitatory and suppressive pathways of a contrast gain control equation. Subtle differences across conditions were explained by variable strengths of divisive suppression. For example, contrast-matching confirmed that A+B has greater perceived contrast than A in the orientation domain but not the spatial domain, suggesting cross-orientation suppression is weaker than spatial suppression.

  • Baker, D.H., Georgeson, M.A., Wallis, S.A. & Meese, T.S. (2010). Difference between target and background luminance determines the rule for binocular combination. Perception, 39(8): 1149.

Binocular combination of luminance can be investigated by matching binocularly unequal target stimuli to binocularly equal standards. Such experiments typically produce quasi-linear equibrightness contours, which fold back at the extremes owing to Fechner’s paradox (Levelt, 1965 British Journal of Psychology 56 1 – 13). For decrements against a bright background, however, Anstis and Ho (1998 Vision Research 38 523 – 539) reported highly nonlinear functions which imply a winner-take-all rule. Are these very different findings due to the sign of the contrast, the luminance of the target, the luminance of the background, or all three? We performed binocular matching experiments for increments on a black background (2), and either increments, decrements, or both (a light-dark edge) against a grey background (~10 cd/m2). Stimuli were uniform or bipartite discs (diameter 1 deg) or a Gabor patch (1 cycle/deg), and matches were obtained by measuring the point of subjective equality with a 2IFC staircase procedure. Results showed that grey backgrounds were associated with nonlinear combination across eyes, which was particularly severe (close to winner-take-all) for high-contrast decrements, consistent with Anstis and Ho (1998). For increments on a black background, combination was nonlinear at low target luminances (<=2 cd/m2) but became increasingly linear towards higher luminances (>2 cd/m2). Fitting a generic model of binocular combination reveals that the exponent governing summation is inversely related to the (signed) difference between the luminance of the target and that of the background.
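The role of a summation exponent can be sketched with a generic power-law combination rule. This is only an assumed illustrative form, not necessarily the model fitted in the abstract: the function name and the example luminance values are hypothetical.

```python
def binocular_combination(left, right, m):
    """Generic power-law combination of two monocular signals (assumed form).

    m = 1 gives linear summation; large m approaches a winner-take-all (MAX) rule.
    """
    return (left ** m + right ** m) ** (1.0 / m)


# Hypothetical monocular inputs in arbitrary units.
print(binocular_combination(10.0, 2.0, 1))  # linear: the two inputs simply add
print(binocular_combination(10.0, 2.0, 8))  # high exponent: close to the larger input
```

Under this form, a larger exponent makes the combined response depend almost entirely on the stronger eye.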

  • Baldwin, A.S., Meese, T.S. & Baker, D.H. (2010). Loss of contrast sensitivity at 4 cycles/deg depends on eccentricity and meridian but not grating orientation for the central 9 deg of the visual field. Perception, 39(8): 1151.

Surprisingly, there has been no detailed study of the relation between contrast sensitivity and stimulus orientation across the central visual field. Here we measured contrast detection thresholds for cosine-phase log-Gabor stimuli with a spatial frequency of 4 cycles/deg, duration of 100 ms and bandwidths of 1.6 octaves and 25 deg. There were 4 meridians (-45°, 0°, 45°, 90°), 4 stimulus orientations (-45°, 0°, 45°, 90°) and 4 eccentricities (0°, ±1.5°, ±3°, ±4.5°), giving a total of 100 conditions in a randomised blocked design. To reduce extrinsic uncertainty, a low contrast ring (diameter of 0.75 deg) was presented continuously at the appropriate position in the visual field, where it surrounded the stimulus. A similar ring in the centre of the display aided fixation. We found no evidence for the meridional resolution effect of Rovamo et al (1982 Investigative Ophthalmology & Visual Science 23 666-670) or the oblique effect. For two observers, ASB and DHB, the loss of sensitivity with eccentricity averaged 0.6 and 0.82 dB per cycle, respectively; a little more severe than previous reports. In general, sensitivity declined less rapidly for the horizontal meridian than for the vertical meridian for each stimulus orientation. The sensitivity functions were slightly concave on log-linear axes and preliminary analysis attributed the anisotropy to the initial slopes of bi-linear fits – the second parts of the slopes being fairly uniform. These results will help to constrain the interpretation of previous and future studies addressing the details of spatial integration of contrast in the central visual field.
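One reading of the design that reproduces the stated total of 100 conditions is sketched below. It assumes four distinct stimulus orientations and that the central (0°) location is shared by all four meridians; the variable names are illustrative only.

```python
meridians = [-45, 0, 45, 90]
orientations = [-45, 0, 45, 90]          # assumed: four distinct orientations
offsets = [-4.5, -3.0, -1.5, 0.0, 1.5, 3.0, 4.5]

# Assumption: the foveal location belongs to every meridian, so count it once.
locations = {(None, 0.0)} | {(m, e) for m in meridians for e in offsets if e != 0.0}
conditions = [(loc, ori) for loc in locations for ori in orientations]

print(len(locations), len(conditions))  # 25 100
```

That is, 4 meridians × 6 non-central positions plus one shared centre gives 25 locations, each tested at 4 orientations.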

  • Georgeson, M.A., Meese, T.S. & Baker, D.H. (2010). Detecting contrast differences in binocular and dichoptic vision: we use monocular or binocular channels, whichever gives the MAX response. Journal of Vision, 10(7): 350.

Two eyes are often better than one. Models based on binocular summation of signals from each eye, with interocular contrast gain control and a single binocular output channel, account well for detection, discrimination and perception of monocular and binocular contrast. We now ask whether monocular signals also remain available to perception. Horizontal 1c/deg sine-wave gratings of contrast C were presented to both eyes for 200ms in 2AFC discrimination tasks, to determine whether contrast increments (C+dC) in one eye were more difficult to detect when accompanied by contrast decrements (C-dC) in the other eye. Summation or averaging over the two eyes should make these opposite changes cancel. Results consistently showed no cancellation. Binocular increments or decrements were more detectable than monocular ones, but thresholds for the hybrid increment/decrement condition were close to those for monocular contrast increment (on the binocular pedestal). Since the binocular channel must suffer cancellation, its absence here implies that monocular signals can remain available to perception and decision, alongside the combined binocular response. Despite this, monocular decrements of contrast on a binocular pedestal were unusually difficult to detect. An extended version of our 2-stage gain-control model (Meese, Georgeson & Baker, Journal of Vision 2006), now incorporating left-eye, right-eye and binocular channels, accurately explained the patterns of threshold variation over at least 7 distinct forms of dipper function. Importantly, the model observer is assumed to pick only the MAX response across the 3 types of channel. This normally arises from the binocular channel, which can thus occlude useful information in the monocular channels. But when the pedestal gratings are out-of-phase in the 2 eyes, interocular suppression wipes out the binocular response, and monocular channels mediate the task. This switch from binocular to monocular responses may be the early, local bias for binocular rivalry.

Classical studies of area summation – in which detection thresholds are measured as a function of target diameter – confound summation of signal with summation of internal noise, and are compromised by retinal inhomogeneity. A “swiss cheese” stimulus recently introduced by Meese & Summers (2007; Proc Biol Sci, 274, 2891-2900) was designed to avoid these problems by keeping target diameter constant and modulating the contrast of interdigitated ‘check’ regions. This approach has revealed substantial area summation at and above detection threshold. Here, we investigate the spatial limits of this integration process over a range of carrier frequencies (1 – 16c/deg) and modulator frequencies (0.25 – 32cycles/check). We used two experimental designs: a simple method in which component ‘check’ thresholds were compared with those for their linear sum, and a normalization method in which the strength of each component in the compound stimulus depended on its detectability. The second design was of particular benefit for large check sizes (low spatial frequency modulators). Plotting results as functions of carrier cycles per check revealed contrast summation to be scale invariant for both designs. Summation remained strong (~6dB) up to at least 4cycles/check, implying linear physiological summation over 8 carrier cycles or more, and declined monotonically for larger check sizes. We consider area summation models involving spatial filtering, nonlinear transduction, linear summation over a fixed region, and Minkowski summation over multiple regions. These analyses support our conclusion that physiological summation of contrast occurs over a minimum of 8 carrier cycles after the initial stage of linear spatial filtering.
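The summation figures in the abstract above can be related to threshold ratios with a short sketch. It assumes the usual convention in this literature that contrast in dB is 20·log10 of the ratio, and a simple Minkowski pooling rule over equally sensitive regions; the function names are illustrative.

```python
import math


def ratio_to_db(ratio):
    """Convert a contrast threshold ratio to dB (20 * log10 convention)."""
    return 20 * math.log10(ratio)


def minkowski_summation_db(n_regions, exponent):
    """Predicted threshold improvement (dB) when pooling n equally sensitive
    regions with Minkowski exponent `exponent` (assumed simple form)."""
    return ratio_to_db(n_regions ** (1.0 / exponent))


print(round(minkowski_summation_db(2, 1), 2))  # linear summation of two regions: 6.02
print(round(minkowski_summation_db(2, 4), 2))  # fourth-root summation: 1.51
```

On this reading, summation of about 6 dB corresponds to a twofold threshold improvement, which is what linear summation over a doubled signal area predicts.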

  • Baker, D.H. & Graf, E.W. (2009). Surround motion affects speed encoding at an early stage of processing. Perception, 38(S): 9.

Surround motion can strongly modulate the perceived speed of a central stimulus, yet the mechanisms behind this process are unknown. Using translating gratings (1 cycle deg-1, 1 deg s-1) surrounded by filtered noise textures, we conducted experiments to measure spatiotemporal tuning, contrast dependency and envelope properties of surround modulation in two directions. Plotted in terms of relative surround speed, perceived (matched) speed followed a sigmoidal function, saturating at the fastest speeds, ruling out a simple differencing process. Effect size increased with temporal frequency (speed * SF) and showed some spatial frequency tuning. Reductions in perceived speed saturated as a function of surround contrast and were constant with envelope blur. Perceived speed increases were weaker for sharper envelopes, but increased with surround contrast. We then asked whether surround effects occur before or after pattern motion computation. Using plaid stimuli with components ±45°, we measured the PSE for global plaid direction with and without surrounds drifting along the motion axis of one component. We observed substantial shifts (up to 20°) in the perceived plaid direction, consistent with surround-induced perceived speed changes. No effect was found for a grating drifting in the pattern direction. This suggests that surround effects occur before pattern integration in extra-striate areas (MT).

The spatiotemporal structure of natural images has characteristic amplitude and phase spectra. For example, the distribution of spatial and temporal frequency information is proportional to 1/f^α, where f is frequency and α has a value near unity. The visual system seems optimized to these properties, with discrimination performance and gain control mechanisms most efficient when α≈1 (e.g. McDonald & Tadmor, 2006, Vis Res, 46: 3098-3104). Here, we ask if binocular rivalry is sensitive to properties typical of natural scenes. We used filtered 2D noise (tinted red or blue to aid identification) and varied the value of α in either the spatial or temporal domain in two separate experiments. All stimuli were equated for RMS contrast and presented dichoptically in counterbalanced, pairwise factorial combination (2 experiments, 15 unique pairs each, 4 observers, 5 repetitions, 1-min trials). We found that stimuli for which α=1 showed the greatest predominance in both the spatial and temporal domains. We compare these findings to perceived contrast measurements for the same stimuli, and the total contrast energy in each image after passing through a model contrast sensitivity function. We conclude that the strong contrast dependency of rivalry is the mechanism by which binocular vision is optimized for viewing natural images. Additionally, we compared rivalry between natural and phase-scrambled images. With stimuli equated for total energy, images with natural phase structure were dominant for 70% of the trial duration (averaged over 8 images and 6 observers for a total of 576 1-min trials). We ruled out the effects of bias using a simulated rivalry condition, which produced an average natural image dominance of 50% (i.e. no bias). This evidence indicates that binocular rivalry is preferentially sensitive to the properties of natural images across space, time and phase.

  • Meese, T.S., Georgeson, M.A., Baker, D.H., Holmes, D.J., Challinor, K.L. & Summers, R.J. (2009). Suppression and summation in contrast gain control for human vision. Perception, 38, 627.

Over the last ten years our understanding of the ascending visual pathway has improved enormously. The long-standing model of probability summation amongst multiple independent mechanisms with static output nonlinearities responsible for masking is obsolete. It has been replaced by a much more complex network of additive, suppressive and facilitatory interactions and nonlinearities across eyes, area, spatial frequency and orientation that extend well beyond the classical receptive field (CRF). A review of a substantial body of psychophysical work performed by others and ourselves, leads us to the following tentative account of the processing path for signal contrast. The first suppression stage is monocular, isotropic, non-adaptable, accelerates with RMS contrast, most potent for low spatial and high temporal frequencies, and extends slightly beyond the CRF. Second and third stages of suppression are difficult to disentangle but are probably pre- and post-binocular summation and involve components that are: spatiotemporally invariant, isotropic, adaptable, achromatic and dichoptic; isotropic and chromatic; anisotropic and achromatic; substantially larger than the CRF, orientation tuned and saturated by contrast. The monocular excitatory pathways begin with half-wave rectification, followed by a preliminary stage of half-binocular summation, a square-law transducer, full binocular summation, pooling over phase, cross-mechanism facilitatory interactions, additive noise, linear summation over area and a slightly uncertain decision maker. The purpose of each of these interactions is far from clear, but the system benefits from area and binocular summation of weak contrast signals followed by ocularity and area invariances (fractal object contrasts don’t change when you close one eye or get closer) owing to the suppressive gain control. One of many remaining challenges is to determine the stage or stages of spatial tuning in the excitatory pathway.

Perceived motion of drifting plaid stimuli is bistable over a wide range of component angles and spatial frequencies (Hupé, J-M. & Rubin, N., 2003, Vision Res, 43: 531-548). Perception alternates between coherent pattern motion and transparent component motion. Given previous findings associating saccades with percept transitions for some bistable stimuli (van Dam, L.C.J. & van Ee, R., 2006, Vision Res, 46: 787-799), we explored the relationship between perceived plaid motion and eye-movements in ten observers. Besides a standard plaid motion condition, during which observers were instructed to fixate centrally, we also included two surround motion conditions (moving dots with speed and direction consistent with the coherent or transparent percept), and two guided eye-movement conditions, where observers tracked a moving fixation point. Observers reported their percept continuously as coherent or transparent using a mouse (60s trial duration). Behaviourally, surround motion and guided eye-movements biased the proportion of coherent/transparent percepts by 5-10%. This occurred largely through extending the durations of percepts directionally congruent with the surround motion or guided eye-movements. Saccades were longer and more numerous in the surround motion or guided eye-movement direction. For all conditions, a number of perceptual transition reports were preceded by blinks, giving a measure of observer response lag (500-1000ms). Saccades congruent with percept direction showed a different pattern, following perceptual transitions. We conclude that i) percept changes elicit eye-movements in the direction of the percept, ii) saccades can prolong an existing percept and iii) surround motion might capture eye-movements, which in turn influence perception.

  • Baker, D.H. & Graf, E.W. (2008). A common factor underlying binocular rivalry and dichoptic masking. Perception, 37(S), 1.

When incompatible images are shown to the two eyes, two empirical phenomena are observed: monocular detection thresholds are elevated (dichoptic masking) and the perceived image changes over time (binocular rivalry). It has recently been shown (van Boxtel et al, 2007, Journal of Vision, 7, 14-3) that these two phenomena have similar perceptual dynamics when images are presented successively. Here, we report a common underlying factor between rivalry and dichoptic masking during simultaneous presentation. Using orthogonal Gabor patches (2cpd) we measured threshold elevation for dichoptic masking, as well as the mean dominance duration of binocular rivalry, in a group of 41 subjects. Both threshold elevation and dominance durations varied substantially across observers, and were highly correlated (r=0.44, p<0.01) such that stronger dichoptic masking was associated with longer dominance durations. Within subjects, we also varied the angle between dichoptic Gabors, producing a similar pattern of results. These findings are accounted for by a single computational model in which the weight of interocular suppression determines both threshold elevation and dominance durations.

Dominance periods in binocular rivalry can be influenced by contextual information and spatial relationships. Recently, Alais et al (2006, Vis Res, 46: 1473-1487) demonstrated that pairs of collinear elements tend to alternate together, suggesting a mechanism (contour integration) by which image features may be bound together during rivalry. To investigate this, we constructed curved monocular contours of between one and five Gabor elements (4cpd) equidistant from fixation in either the left or right hemifield, and rivalling with binary noise. Observers reported when all Gabors were either present or absent, giving an index of alternation coherency. Elements consistent with a continuous contour were coherent for a greater proportion of trials than expected by chance (calculated as 2/2^n; n is number of elements). Elements orthogonal to the contour were less coherent, though still greater than chance, whereas reported coherence for randomly oriented elements was at chance. These effects disappeared (or reversed) when successive elements were presented to different eyes, indicating that the binding effects are eye-specific.
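The chance level quoted above, 2 divided by 2 to the power n, follows if each of the n elements is independently visible or suppressed with equal probability, since only two of the 2^n joint states (all present, all absent) count as coherent. A minimal sketch under that independence assumption:

```python
from fractions import Fraction


def chance_coherence(n):
    """Probability that n independent, equiprobable elements are all
    visible or all suppressed at the same moment: 2 / 2**n."""
    return Fraction(2, 2 ** n)


for n in range(1, 6):
    print(n, chance_coherence(n))
```

For the one to five elements used in the study, this gives 1, 1/2, 1/4, 1/8 and 1/16.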

Here we assess whether summation of contrast occurs over eyes and space conjointly. Stimuli were sine-wave gratings (2.5 cycles deg-1) spatially modulated by cosine- and anticosine-phase plaids. This produced patchy gratings where patches were placed at the centres of either the ‘black’ or ‘white’ checks of a notional checkerboard. One eye was presented with pedestal patches in one of these locations (eg ‘black’) and the other eye was presented with pedestal patches in the other locations (eg ‘white’). Contrast increments were presented to one or both eyes (single or dual increments, respectively). Conventional dipper functions were found, but the dual increments were shifted downwards by 4.8 dB. We considered 192 model architectures containing each of the following four elements in all possible orders: (i) linear summation or a MAX operator across eyes, (ii) linear summation or a MAX operator across space, (iii) linear or accelerating contrast transduction, and (iv) additive Gaussian stochastic noise. Formal equivalences reduced this to 48 different models, only 4 of which were consistent with our empirical estimates of summation ratios and slopes of the psychometric functions. 2 of these were rejected by considerations outside the present work. Our preferred model was: linear summation across eyes followed by nonlinear contrast transduction, linear summation across space, and late noise. Results were inconsistent with a MAX operator across eyes but a MAX operator across space remains a viable alternative for the stimulus conditions here. In any case, suprathreshold pooling of contrast across different regions of the retina in different eyes is a property of human vision at threshold and above.
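The count of 192 architectures can be checked by enumeration: four stages in any order (4! = 24 orderings), with two options for each of three stages (2 × 2 × 2 = 8). The sketch below uses illustrative labels and does not attempt the reduction to 48 models, because the abstract does not state the formal equivalences.

```python
from itertools import permutations, product

stages = ("eyes", "space", "transducer", "noise")
options = {
    "eyes": ("sum", "max"),
    "space": ("sum", "max"),
    "transducer": ("linear", "accelerating"),
    "noise": ("gaussian",),  # the noise stage has a single form
}

# Every ordering of the four stages, crossed with every choice at each stage.
architectures = [
    tuple(zip(order, choice))
    for order in permutations(stages)
    for choice in product(*(options[stage] for stage in order))
]

print(len(architectures))  # 192
```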

  • Baker, D.H. & Graf, E.W. (2008). Perceived and true speeds have the same effect on binocular rivalry. Perception, 37, 310-311.

The relative dominance of gratings engaged in binocular rivalry can be influenced by their surroundings. For drifting stimuli, central gratings opposing the background motion are more dominant (Paffen et al, 2004 Vision Research 44 1635-1639). Such centre-surround stimulus configurations can, however, produce a profound change in perceived speed (Norman et al, 1996 Perception 25 815-830). We used rivalling orthogonal Gabor patches (1 cycle deg-1, 100% contrast, ±45deg), drifting at 0.5 deg s-1, embedded in a noise texture drifting at the same speed. Varying the direction of the noise affected the dominance of each grating in the direction expected from previous work. We then used a spatial 2AFC task to match the speed of a noise-embedded Gabor (standard) with that of a Gabor surrounded by mean luminance (test). As expected, background motion produced substantial changes in perceived speed; at least by a factor of two for all subjects. Lastly, we simulated the context experiment by using gratings (surrounded by mean luminance) moving at different physical speeds, as determined by the matching data. We found the same pattern of dominance as for the context experiment. This suggests that perceived and true speeds influence rivalry in the same manner, perhaps at the same neural locus. Since direction-tuned suppressive and facilitatory surround processes occur in area MT, these findings imply a key role for this brain area in rivalry, through either modulating signals directly or by feedback to earlier visual areas.

  • Baker, D.H., Meese, T.S., Patel, K. & Sarwar, W. (2007). Interocular suppression is scale invariant, but ipsiocular suppression is weighted by flicker speed. Perception, 36(S), 60.

In human and cat there are two routes to suppression for orthogonal masks: a broadband, non-adaptable, ipsiocular pathway, and a more narrowband, adaptable interocular pathway. We investigated the strength of both types of suppression in humans across spatio-temporal scale using orthogonal pairs of superimposed Gabor patches (mask and target) flickering at four spatial (0.5, 1, 2, 4 cycles deg-1) and two temporal (4 and 15 Hz) frequencies. Mask and target were presented to the same eye or different eyes in 2IFC cross-orientation masking experiments. Masking functions were normalized to baseline detection thresholds and fit by a two-stage model of contrast gain control (Meese et al, 2006 Journal of Vision 6 1224 – 1243) developed to accommodate cross-orientation masking. The weight of ipsiocular suppression was proportional to the square-root of stimulus speed (TF/SF), as in the binocular case (Meese and Holmes, 2007 Proceedings of the Royal Society of London, Series B 274 127 – 136). However, dichoptic-masking functions superimposed, showing that the interocular, presumably cortical, process is scale-invariant. These findings have implications for studies of amblyopia, binocular rivalry, and single-cell physiology.

  • Baker, D.H., Meese, T.S. & Patryas, L. (2007). Binocular summation is more tightly tuned to spatial frequency, orientation and spatial phase than interocular suppression. Perception, 36(9), 1401.

Binocular vision involves at least two interactions between the eyes: interocular suppression and binocular summation. Both contribute to dichoptic masking, but the second also contributes to facilitation. Here we used a 2AFC contrast-masking paradigm and horizontal 1 cycle deg target gratings (200 ms) to characterise the spatial properties of these two processes. In experiment 1, dichoptic masks were the same as the target but were either in-phase or out-of-phase. For in-phase masking, suppression was strong (log-log slope of ~1) at moderate mask contrasts and above, and there was weak facilitation at low mask contrasts. Anti-phase masking was weaker (log-log slope of ~0.6) and there was no facilitation. The in-phase function set the parameters of our model (Meese et al, 2006 Journal of Vision 6 1224-1243), which predicted the anti-phase function when binocular summation was selective for phase, but interocular suppression was not. In experiment 2, the spatial frequency and orientation tuning of both processes were measured with the use of high-contrast dichoptic masks. By using masks in-phase and out-of-phase with the target we were able to decouple the masking produced by the two processes. Interocular suppression had an orientation bandwidth of ±30deg, and a spatial frequency bandwidth >2 octaves. Binocular summation was much more narrowly tuned with an orientation bandwidth of ±7.5deg, and a spatial frequency bandwidth of

Contrast vision in strabismic amblyopia is characterised by (i) threshold elevation in the amblyopic eye, (ii) poor binocular summation at threshold, and (iii) abnormal dichoptic masking (Harrad and Hess, 1992 Vision Research 32 2135-2150). We develop this here by reporting contrast masking functions for five strabismic amblyopes. Patches of horizontal grating with spatial frequencies of 0.5 or 3 cycles deg-1 were presented to the same (monoptic, left and right), different (dichoptic, left and right), or both (binocular) eyes (five conditions in total). All subjects had higher thresholds in the amblyopic eye and typically showed substantial levels of dichoptic masking in each eye. Otherwise, the subjects fell into two groups. In one group (n=2), small levels of dichoptic facilitation were found, similar to normal observers (Meese et al, 2006 Journal of Vision 6 1224 – 1243). The results from this group were strikingly similar to those of a normal observer with a neutral density filter in front of one eye. In all cases, the loss of binocular summation could be attributed to the low sensitivity in the affected eye. The other group (n=3) showed no evidence of dichoptic facilitation and their loss of binocular summation could not be attributed to a loss of contrast sensitivity in the affected eye. One possibility is that their eyes operate independently, with perceived stimulus strength determined by the most active ocular channel (be that mask or test), resulting in dichoptic masking (without suppression) and no binocular summation. Our findings suggest that the visual architectures amongst strabismic amblyopes might vary considerably.

  • Baker, D.H. & Meese, T.S. (2006). Cross-orientation suppression occurs before binocular summation: evidence from masking and adaptation. Journal of Vision, 6(6), 821.

The threshold elevation produced by a grating mask with very dissimilar orientation from a target is sometimes called cross-orientation suppression (XOS). Once thought to be a single process within visual cortex, recent single-cell studies suggest earlier processes specific to eye of origin (e.g. Li et al. 2005, J Neurophysiol, 94(2), 1645-1650). Here, we investigate interocular XOS psychophysically using 1c/deg horizontal test gratings and cross-oriented masks. Masking functions for monoptic and dichoptic masks did not superimpose when plotted against contrast (0%-45%@200ms; Experiment 1) or duration (25-400ms@45%; Experiment 2). For example, monoptic XOS decreased and dichoptic XOS increased, as functions of duration. These results reject models in which XOS occurs only after binocular summation because such models predict that dichoptic and monoptic masking are identical. An unexpected finding was that a monoptic + dichoptic mask condition produced less masking than the dichoptic mask alone, suggesting interocular suppression of the mask components prior to dichoptic XOS. In Experiment 3, we found that dichoptic, but not monoptic, masking was reduced by adapting to the mask, consistent with cat physiology and a cortical locus for dichoptic masking. We propose a quantitative model of all our data where XOS is: (i) non-adaptable (and possibly precortical) for the monoptic case and (ii) adaptable (and presumably cortical) for the dichoptic case. This model also explains the finding that binocular XOS does not adapt (Foley & Chen, 1997, Vis Res, 37(19), 2779-2788) because in that condition, the adaptable contribution to XOS is negligible due to the interocular suppression described above.

In monoptic and dichoptic masking paradigms, test and mask stimuli are presented to the same and different eyes, respectively. By using mask and test stimuli that are sufficiently different not to excite the same detecting mechanism, suppressive processes can be investigated without the complicating problem of excitatory summation. In the case of binocular stimulation, this type of experiment has led to the concept of contrast gain control by a broad-band pool of suppressive mechanisms [eg Foley, 1994 Journal of the Optical Society of America A 11 1710 – 1719]. One possibility is that broad-band suppression occurs after binocular summation, in which case cross-orientation masking for monoptic and dichoptic conditions should be identical. We tested this prediction on three observers by measuring contrast masking functions at two different stimulus durations (50 ms and 200 ms), for horizontal patches of 1 cycle/deg test grating in the presence of either an orthogonal 1 cycle/deg mask or an oblique 3 cycles/deg mask. We also measured masking as a function of stimulus duration (25 ms to 400 ms) for zero and high-contrast masks (45%). We found that: (i) masking increased as a function of mask contrast, and (ii) monoptic and dichoptic masking decreased and increased as functions of duration, respectively. In no case did monoptic and dichoptic masking functions superimpose. These results suggest a scheme in which cross-orientation suppression occurs within and between the eyes, and where both of these effects must impact before binocular summation. These results and conclusions show some parallels with recent reports for two different processes of cross-orientation suppression at a cellular level (eg Li et al, 2005 Journal of Neurophysiology 94 1645 – 1650).

  • Meese, T.S., Georgeson, M.A. & Baker, D.H. (2005). Interocular masking and summation indicate two stages of divisive contrast gain control. Perception, 34(S), 42-43.

Our understanding of early spatial vision owes much to contrast masking and summation paradigms. In particular, the deep region of facilitation at low mask contrasts is thought to indicate a rapidly accelerating contrast transducer (eg a square-law or greater). In experiment 1, we tapped an early stage of this process by measuring monocular and binocular thresholds for patches of 1 cycle/deg sine-wave grating. Threshold ratios were around 1.7, implying a nearly linear transducer with an exponent around 1.3. With this form of transducer, two previous models (Legge, 1984 Vision Research 24 385 – 394; Meese et al, 2004 Perception 33 Supplement, 41) failed to fit the monocular, binocular, and dichoptic masking functions measured in experiment 2. However, a new model with two stages of divisive gain control fits the data very well. Stage 1 incorporates nearly linear monocular transducers (to account for the high level of binocular summation and slight dichoptic facilitation), and monocular and interocular suppression (to fit the profound dichoptic masking). Stage 2 incorporates steeply accelerating transduction (to fit the deep regions of monocular and binocular facilitation), and binocular summation and suppression (to fit the monocular and binocular masking). With all model parameters fixed from the discrimination thresholds, we examined the slopes of the psychometric functions. The monocular and binocular slopes were steep (Weibull beta ~3 – 4) at very low mask contrasts and shallow (beta ~1.2) at all higher contrasts, as predicted by all three models. The dichoptic slopes were steep (beta ~3 – 4) at very low contrasts, and very steep (beta > 5.5) at high contrasts (confirming Meese et al, loco cit.). A crucial new result was that intermediate dichoptic mask contrasts produced shallow slopes (beta ~2). Only the two-stage model predicted the observed pattern of slope variation, so providing good empirical support for a two-stage process of binocular contrast transduction.
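
The link between a threshold ratio of about 1.7 and an exponent of about 1.3 can be illustrated with a short calculation. This is a minimal sketch, not the authors' model: it assumes each eye's contrast is raised to a power m, the two transduced signals are summed, and threshold is reached at a fixed criterion response, so that the monocular-to-binocular threshold ratio is 2^(1/m). The function names are hypothetical.

```python
import math


def summation_ratio(exponent):
    """Monocular/binocular threshold ratio for a power-law transducer.

    Assumption: response = L**m + R**m and threshold is a fixed response,
    so c_mono**m = 2 * c_bin**m and the ratio is 2**(1/m).
    """
    return 2.0 ** (1.0 / exponent)


def implied_exponent(ratio):
    """Invert the relation above: m = log(2) / log(ratio)."""
    return math.log(2.0) / math.log(ratio)


if __name__ == "__main__":
    # A ratio of 1.7 implies an exponent close to 1.3 under this assumption.
    print(round(implied_exponent(1.7), 2))
    # A square-law transducer (m = 2) would give a ratio of root 2.
    print(round(summation_ratio(2.0), 3))
```

Under the same assumption, a square-law transducer would give only a root-2 ratio, which is why a ratio near 1.7 points to a transducer that is closer to linear.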

In experiments reported elsewhere at this conference, we have revealed two striking results concerning binocular interactions in a masking paradigm. First, at low mask contrasts, a dichoptic masking grating produces a small facilitatory effect on the detection of a similar test grating. Second, the psychometric slope for dichoptic masking starts high (Weibull beta~4) at detection threshold, becomes low (beta~1.2) in the facilitatory region, and then unusually steep at high mask contrasts (beta > 5.5). Neither of these results is consistent with Legge’s (1984 Vision Research 24 385 – 394) model of binocular summation, but they are predicted by a two-stage gain control model in which interocular suppression precedes binocular summation. Here, we pose a further challenge for this model by using a ‘twin-mask’ paradigm (cf Foley, 1994 Journal of the Optical Society of America A 11 1710 – 1719). In 2AFC experiments, observers detected a patch of grating (1 cycle/deg, 200 ms) presented to one eye in the presence of a pedestal in the same eye and a spatially identical mask in the other eye. The pedestal and mask contrasts varied independently, producing a two-dimensional masking space in which the orthogonal axes (10 x 10 contrasts) represent conventional dichoptic and monocular masking. The resulting surface (100 thresholds) confirmed and extended the observations above, and fixed the six parameters in the model, which fitted the data well. With no adjustment of parameters, the model described performance in a further experiment where mask and test were presented to both eyes. Moreover, in both model and data, binocular summation was greater than a factor of root 2 at detection threshold. We conclude that this two-stage nonlinear model, with interocular suppression, gives a good account of early binocular processes in the perception of contrast.

  • Georgeson, M.A., Meese, T.S. & Baker, D.H. (2005). Binocular summation, dichoptic masking and contrast gain control. Journal of Vision, 5(8), 797.

We consider a classical question – how signals from the two eyes are combined – in the context of contemporary models of contrast gain control. In 2AFC experiments, observers had to detect the presence of a test grating (1 c/deg, 200 ms) in one or both eyes, in the presence or absence of a similar masking (‘pedestal’) grating in one or both eyes. We found a high degree of binocular summation when pedestal contrast was low or zero, while at higher contrasts we confirmed Legge’s (1984) paradoxical finding that there was no advantage for detecting binocular contrast increments over purely monocular ones. In a new variant, however, we found that, on a binocular pedestal, binocular increments were better detected than monocular ones. This implies that there is binocular summation of test signals even in the suprathreshold task. Importantly, there is also binocular summation of suppressive (gain control) signals: monocular increments were harder to detect on a binocular pedestal than on a monocular one. The pattern of results can be largely, but not completely, understood through a binocular version of the standard gain control equation: Resp(binoc) = (L^p + R^p)/(s^q + L^q + R^q), expressing the output of a binocular channel to contrasts L,R in the left and right eyes, with p,q,s constant (p~2.4, q~2). With additive noise, this mechanism correctly predicts the high thresholds and unusually steep, step-like psychometric functions that we observed in dichoptic masking (test in one eye, pedestal in the other). But this mechanism under-estimates both facilitation and binocular summation at low contrasts, so we shall consider what modifications are needed. Viable options include more than one output channel, and more than one stage at which nonlinear transduction and gain control operate.
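
The gain control equation in this abstract can be evaluated directly, reading p and q as exponents. The sketch below is illustrative only: the abstract gives p~2.4 and q~2 but no value for s, so s = 1 (with contrast in percent) is an assumption made here, and the function name is hypothetical.

```python
def binocular_response(left, right, p=2.4, q=2.0, s=1.0):
    """Output of a single binocular channel under divisive gain control.

    Implements Resp = (L**p + R**p) / (s**q + L**q + R**q).
    p and q follow the approximate values quoted in the abstract;
    s = 1.0 is an assumed placeholder, not a fitted value.
    """
    numerator = left ** p + right ** p
    denominator = s ** q + left ** q + right ** q
    return numerator / denominator


if __name__ == "__main__":
    # Compare one-eye and two-eye stimulation at a low and a high contrast.
    for contrast in (1.0, 50.0):
        mono = binocular_response(contrast, 0.0)
        bino = binocular_response(contrast, contrast)
        print(contrast, mono, bino, bino / mono)
```

With these assumed values, the two-eye response exceeds the one-eye response at low contrast, while at high contrast the two are almost equal, because the suppressive terms in the denominator sum across the eyes as well as the excitatory terms in the numerator.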

Last updated 2/1/20
