• Realism: there is a real world to sense
  • Positivism: all we really have to go on is the evidence of the senses, so the world might be nothing more than an elaborate hallucination
  • Euclidean: referring to the geometry of the world; parallel lines remain parallel as they are extended in space, objects remain the same size and shape as they move around in space, the internal angles of a triangle always add to 180 degrees, and so forth
  • The geometry of retinal images is non-Euclidean, because the three-dimensional world is projected onto the curved, two-dimensional surface of the retina. Parallel lines do not necessarily remain parallel in the retinal image, the angles of a triangle do not always add up to 180 degrees, and the retinal area occupied by an object gets smaller as the object moves farther away from the eyeball. This means we have to reconstruct the Euclidean world from non-Euclidean input
  • Binocular disparity: the differences between the two retinal images of the same scene. Disparity is the basis for stereopsis, a vivid perception of three-dimensionality of the world that is not available with monocular vision
  • Stereopsis: the ability to use binocular disparity as a cue to depth
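
The shrinking of the retinal image with distance can be made concrete with the visual-angle formula. This is a minimal sketch: the function name `visual_angle_deg` and the example sizes are assumptions for illustration, not part of the notes.

```python
import math

def visual_angle_deg(object_size_m: float, distance_m: float) -> float:
    """Visual angle (degrees) subtended by an object viewed head-on."""
    return math.degrees(2 * math.atan(object_size_m / (2 * distance_m)))

# A 1.8 m tall person at increasing distances: the visual angle, and hence
# the retinal image, gets smaller as the person moves away.
for d in (2, 4, 8):
    print(d, round(visual_angle_deg(1.8, d), 1))
```

The physical size never changes, only the angle does, which is why the visual system has to reconstruct constant size from a changing image.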

Monocular cues to three-dimensional space

  • We use depth cues to infer aspects of the three-dimensional world from our two-dimensional retinal images

Occlusion

  • Occlusion: a cue to relative depth order in which, for example, one object obstructs the view of part of another object
    • Wrong only in the case of accidental viewpoints
    • Nonmetrical depth cue: it gives us only the relative ordering of occluders and occluded objects; it tells us nothing about depth magnitude
  • Metrical depth cue: a depth cue that provides info about distance in the third dimension

 

Size and position cues

  • Projective geometry: the geometry that describes the transformations that occur when the three-dimensional world is projected onto a two-dimensional surface
  • Relative size: the visual system knows that, all else being equal, smaller things are farther away. This cue is a comparison of size between objects, made without knowing the absolute size of any one of them
  • Texture gradient: a depth cue based on the geometric fact that items of the same size form smaller images when they are farther away. An array of items that change in size smoothly across the image will appear to form a surface tilted in depth
  • Relative height: the observation that objects at different distances from the viewer on the ground plane will form images at different heights in the retinal image
    • For objects on the ground plane, objects that are more distant will be seen as higher in the visual field
  • Familiar size: a depth cue based on knowledge of the typical size of objects
  • Relative metrical depth cue: a depth cue that could specify, for example, that object A is twice as far away as object B, without providing info about the absolute distance to either A or B → Relative size, relative height, linear perspective, motion parallax
  • Absolute metrical depth cue: a depth cue that provides quantifiable info about distance in the third dimension → familiar size, accommodation, convergence
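
The difference between a relative and an absolute metrical cue can be sketched numerically. The function names and example values below are hypothetical; the only geometry assumed is that an object of size s at distance d subtends a visual angle θ with tan(θ/2) = s / (2d).

```python
import math

def distance_from_familiar_size(known_size_m: float, visual_angle_deg: float) -> float:
    """Absolute distance implied by a known physical size and its visual angle."""
    return known_size_m / (2 * math.tan(math.radians(visual_angle_deg) / 2))

def relative_distance(angle_a_deg: float, angle_b_deg: float) -> float:
    """How many times farther away A is than B, assuming A and B have the
    same (unknown) physical size. No absolute distance is recovered."""
    ta = math.tan(math.radians(angle_a_deg) / 2)
    tb = math.tan(math.radians(angle_b_deg) / 2)
    return tb / ta

# Familiar size: a coin known to be 2.4 cm wide that subtends 1 degree.
print(distance_from_familiar_size(0.024, 1.0))

# Relative size: A subtends half the angle of B, so A is about twice as far.
print(relative_distance(1.0, 2.0))
```

Relative size yields only the ratio; it takes knowledge of the true size (familiar size) to turn the same visual angle into a distance in metres.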

Aerial perspective

  • Haze/aerial perspective: a depth cue based on the implicit understanding that light is scattered by the atmosphere. More light is scattered when we look through more atmosphere, thus more distant objects are subject to more scatter and appear fainter, bluer, and less distinct
    • Short wavelengths are scattered more than medium and long wavelengths

Linear perspective

  • Linear perspective: a depth cue based on the fact that lines that are parallel in the three-dimensional world will appear to converge in a two-dimensional image
    • However, if the parallel lines lie in a plane that is parallel to the plane of the two-dimensional image, as with the edges of a closed door, the lines will remain parallel in the retinal image
  • Vanishing point: the apparent point at which parallel lines receding in depth converge
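
Projective geometry can be illustrated with a pinhole model onto a flat image plane. This is a simplification (the retina is curved), and the function name `project`, the focal length and the rail layout are assumptions made for the sketch.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) onto a flat image plane."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

# Two parallel rails on the ground (Y = -1), one unit to either side of the
# viewer, receding in depth Z.
for z in (2, 10, 100, 10000):
    left = project((-1.0, -1.0, z))
    right = project((1.0, -1.0, z))
    print(z, left, right)
# As Z grows, both rails approach the image point (0, 0): the vanishing point.
# They also rise in the image, which is the relative-height cue.

# Lines in a plane parallel to the image plane (constant Z) stay parallel:
top = [project((x, 1.0, 5.0)) for x in (-1, 0, 1)]
bottom = [project((x, -1.0, 5.0)) for x in (-1, 0, 1)]
print(top, bottom)
```

The same division by Z produces relative size and texture gradients: anything of fixed size shrinks in the image in proportion to its distance.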

Pictorial depth cues and pictures

  • Pictorial depth cues: the cues produced by projection of the three-dimensional world onto the two-dimensional surface of the retina
  • Realistic pictures or photographs are the result of projecting the three-dimensional world onto the two-dimensional surface of film and canvas. When that image is viewed from the correct position, the retinal image formed by the two-dimensional picture will be the same as the retinal image that would have been formed by the three-dimensional world, and hence we see depth in the picture

  • Anamorphosis/anamorphic projection: use of the rules of linear perspective to create a two-dimensional image so distorted that it looks correct only when viewed from a special angle or with a mirror that counters the distortion

Motion cues

  • Motion parallax: a depth cue that is based on head movement. The geometric info obtained from an eye in two different positions at two different times is similar to the info from two eyes in different positions in the head at the same time (stereopsis)
    • When you change your viewpoint, objects closer to you shift position more than objects farther away
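
A rough sketch of why nearer objects shift more: for a stationary object that starts straight ahead, a sideways eye movement of t metres changes its direction by atan(t / d). The function name and the numbers are illustrative assumptions.

```python
import math

def parallax_shift_deg(head_shift_m: float, distance_m: float) -> float:
    """Change in visual direction (degrees) of a stationary object that was
    straight ahead, after the eye moves sideways by head_shift_m."""
    return math.degrees(math.atan(head_shift_m / distance_m))

# The same 10 cm head movement, objects at three distances.
for d in (0.5, 5.0, 50.0):
    print(d, round(parallax_shift_deg(0.10, d), 2))
```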

Accommodation and convergence

  • Convergence: the ability of the eyes to turn inward. The more the eyes have to converge, and the more the lens has to bulge (accommodation), in order to focus on an object, the closer that object is
  • Divergence: the ability of the eyes to turn outward
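
The link between convergence and distance can be sketched as the angle between the two lines of sight when both eyes fixate a point straight ahead. The interocular distance of 6.3 cm is an assumed typical value, and `vergence_angle_deg` is a hypothetical name.

```python
import math

def vergence_angle_deg(distance_m: float, interocular_m: float = 0.063) -> float:
    """Angle (degrees) between the two lines of sight when both eyes fixate
    a point straight ahead at distance_m."""
    return math.degrees(2 * math.atan(interocular_m / (2 * distance_m)))

# The angle changes a lot for near targets and hardly at all for far ones.
for d in (0.25, 0.5, 1.0, 2.0, 10.0):
    print(d, round(vergence_angle_deg(d), 2))
```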

Binocular vision and stereopsis

  • Corresponding retinal points: points on the two retinas where the monocular retinal images of a single object are formed at the same distance from the fovea in each eye
  • Vieth-Müller circle: the location of objects whose images fall on geometrically corresponding points in the two retinas
  • Horopter: the location of objects whose images lie on corresponding points. The surface of zero disparity
  • Panum’s fusional area: the region of space, in front of and behind the horopter, within which binocular single vision is possible
  • Diplopia: double vision. If visible in both eyes, stimuli falling outside of Panum’s fusional area will appear diplopic

 

  • Crossed disparity: the sign of disparity created by objects in front of the plane of fixation (the horopter). The term crossed is used because images of objects located in front of the horopter appear to be displaced to the left in the right eye, and to the right in the left eye
  • Uncrossed disparity: the sign of disparity created by objects behind the plane of fixation (the horopter). The term uncrossed is used because images of objects located behind the horopter appear to be displaced to the right in the right eye, and to the left in the left eye

 

Stereoscopes and stereograms

  • Stereoscope: a device for simultaneously presenting one image to one eye and another image to the other eye. Stereoscopes can be used to present dichoptic stimuli for stereopsis and binocular rivalry
    • Shows that the visual system treats binocular disparity as a depth cue, regardless of whether it is produced by actual or simulated images of a scene → binocular disparity is a necessary condition for stereopsis
  • Free fusion: the technique of converging or diverging the eyes in order to view a stereogram without a stereoscope
    • If you cross your eyes (convergence), relaxing them just a bit, while looking at two side-by-side pictures, a third picture will appear in the middle. The far-left picture is seen only by the left eye, the far-right one only by the right eye, and the picture in the middle is the fusion of the picture seen by the left eye and the one seen by the right eye. This fusion of separate images seen by the two eyes makes stereopsis possible
  • Stereoblindness: the inability to make use of binocular disparity as a depth cue

Random dot stereograms

  • At first it was thought that, in free-fusion, you first analyse the input as a picture. You would then use the slight disparities between the left-eye and right-eye images to enrich the sense of the 3D image
  • Instead, Julesz theorized that stereopsis might be used to discover objects and surfaces in the world; he thought that stereopsis might help to reveal camouflaged objects
    • Random dot stereograms (RDSs): stereograms made of a large number of randomly placed dots. Random dot stereograms contain no monocular cues to depth → disparity is sufficient for stereopsis
  • Cyclopean: referring to stimuli that are defined by binocular disparity alone
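
A random dot stereogram is easy to construct, which helps show why it contains no monocular cues. The sketch below is one simple way to build one and is not necessarily Julesz's exact procedure; `make_rds` and its parameters are hypothetical names.

```python
import random

def make_rds(width=40, height=20, shift=2, seed=0):
    """Return (left, right) images as lists of rows of 0/1 dots.

    The right image is a copy of the left, except that a central square is
    moved `shift` dots to the left (crossed disparity, so it should appear
    in front). The strip uncovered by the move is refilled with new dots.
    """
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    right = [row[:] for row in left]
    top, bottom = height // 4, 3 * height // 4
    x0, x1 = width // 4, 3 * width // 4
    for y in range(top, bottom):
        for x in range(x0, x1):
            right[y][x - shift] = left[y][x]   # shifted copy of the square
        for x in range(x1 - shift, x1):
            right[y][x] = rng.randint(0, 1)    # fill the uncovered strip
    return left, right

left, right = make_rds()
for row in left[:3]:
    print("".join("#" if dot else "." for dot in row))
```

Each image on its own is just noise; the square exists only in the relationship between the two images, which is what makes it a cyclopean stimulus.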

Stereo movies, TV and video games

  • Anaglyphic glasses: one red lens and one blue/green lens. The two sets of images are passed through separate colour filters, so that one eye sees one set of images and the other eye sees the other set
    • Later, polarization was used to separate the images
  • Modern 3D glasses contain liquid crystals: alternate frames of the movie are presented to different eyes
  • Lenticular printing: the images are digitally split and interleaved (left, right, left, right, …) at a fixed spacing. A special array of many tiny lenses placed on the screen ensures that the left eye sees only the left image and the right eye sees only the right image. With this technique no glasses are needed

Using stereopsis

  • With eyes only a few centimetres apart you don’t get adequate disparity from distant targets
  • Stereopsis is also used in reading X-ray images of, for instance, mammograms
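
Why distant targets give little disparity can be sketched by treating disparity as the difference between the vergence angles of the fixated point and the object, both straight ahead. The 6.3 cm interocular distance, the function names and the sign convention (positive = crossed) are assumptions made for this example.

```python
import math

def vergence_rad(distance_m: float, interocular_m: float = 0.063) -> float:
    """Angle between the two lines of sight for a point straight ahead."""
    return 2 * math.atan(interocular_m / (2 * distance_m))

def disparity_arcmin(fixation_m: float, object_m: float,
                     interocular_m: float = 0.063) -> float:
    """Disparity of an object relative to fixation, in minutes of arc.
    Positive = crossed (object nearer than fixation), negative = uncrossed."""
    diff = vergence_rad(object_m, interocular_m) - vergence_rad(fixation_m, interocular_m)
    return math.degrees(diff) * 60

# The same 1 m depth step produces far less disparity at a distance.
print(disparity_arcmin(1.0, 2.0))      # fixating at 1 m, object at 2 m
print(disparity_arcmin(100.0, 101.0))  # fixating at 100 m, object at 101 m
```

Disparity falls off roughly with the square of viewing distance, so a wider baseline than the eyes provide is needed to recover depth from far-away scenes.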

Stereoscopic correspondence

  • Correspondence problem: the problem of figuring out which bit of the image in the left eye should be matched with which bit in the right eye
  • If you use only the low-spatial-frequency information in the two pictures, it is easier to combine them, because reducing the amount of info makes it easier to determine what goes with what. You can work from there to determine how to match up the high-spatial-frequency dots, edges and so forth
  • Constraints to achieve correspondence:
    • Uniqueness constraint: the reality that a feature in the world is represented exactly once in each retinal image, which simplifies the problem
    • Continuity constraint: holds that, except at the edges of objects, neighbouring points in the world lie at similar distances from the viewer. Accordingly, disparity should change smoothly at most places in the image
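
The correspondence problem can be illustrated with a naive local matcher that slides a small patch from the left image along the same row of the right image and keeps the best-fitting shift. This is only a sketch and not a model of how the visual system solves the problem; `best_shift` and its parameters are hypothetical.

```python
def best_shift(left_row, right_row, x, patch=5, max_shift=4):
    """Return the leftward shift s (0..max_shift) for which the patch of
    left_row starting at x best matches right_row starting at x - s,
    scored by the number of mismatching elements."""
    target = left_row[x:x + patch]
    best, best_cost = 0, None
    for s in range(max_shift + 1):
        if x - s < 0:
            break
        candidate = right_row[x - s:x - s + patch]
        cost = sum(a != b for a, b in zip(target, candidate))
        if best_cost is None or cost < best_cost:
            best, best_cost = s, cost
    return best

# A row of distinct grey levels, and the same row moved 2 places to the left.
left_row = list(range(20))
right_row = left_row[2:] + [99, 98]
print(best_shift(left_row, right_row, x=5))
```

With distinct values the match is unambiguous. With random black and white dots several shifts can fit a small patch equally well, which is where constraints such as uniqueness and continuity would be needed to choose among candidate matches.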

 

The physiological basis of stereopsis

  • The fundamental requirement for stereopsis is that the input from both eyes must converge onto the same cell. This is a task that the binocular striate cortex neurons are well suited for since they have two receptive fields, one in each eye, that are generally very similar in both eyes, sharing nearly identical orientation and spatial-frequency tuning as well as the same preferred speed and direction of motion. Many binocular neurons respond best when the retinal images are on corresponding points in the two retinas, thereby providing a neural basis for the horopter. However there are also binocular striate cortex neurons that are tuned to a particular binocular disparity

 

  • Nonmetrical stereopsis: might just tell you that a feature lies in front of or behind the plane of fixation → disparity-tuned neurons in V2 and higher cortical areas → what pathway
    • Neurons that respond positively to disparities near zero, i.e. to images that fall on corresponding retinal points
    • Neurons that are tuned to a range of crossed (near) disparities
    • Neurons that are tuned to a range of uncrossed (far) disparities
  • Metrical stereopsis: stereopsis is a hyperacuity, with thresholds smaller than the size of a cone → where pathway
    • Phase-shifted inputs between the receptive fields of the left and right eye: the phase shift corresponds to a binocular disparity and can be used to derive depth from the two retinal images
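
How a phase shift maps onto a disparity can be sketched under a simplifying assumption that is not stated in the notes: both receptive fields are tuned to the same spatial frequency f (cycles per degree), so a phase difference Δφ is equivalent to a positional offset of Δφ / (2πf) degrees. The function name is hypothetical.

```python
import math

def phase_disparity_deg(phase_shift_rad: float, spatial_freq_cpd: float) -> float:
    """Disparity (degrees of visual angle) equivalent to a phase shift between
    left- and right-eye receptive fields tuned to the same spatial frequency."""
    return phase_shift_rad / (2 * math.pi * spatial_freq_cpd)

# A quarter-cycle shift at 2 cycles/deg, expressed in minutes of arc.
print(phase_disparity_deg(math.pi / 2, 2.0) * 60)
```

On this assumption, cells tuned to higher spatial frequencies signal smaller disparities for the same phase shift.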

Combining depth cues

  • Unconscious inference: the automatic combination of depth cues. None of the depth cues works in every possible situation and none is foolproof

Bayesian approach

  • Prior knowledge could influence estimates of the probability of a current observation
    • P(A|O) = P(A) × P(O|A) / P(O): enables us to calculate the probability (P) that the world is in a particular state (A), given a particular observation (O)
    • This is the prior probability of state A, P(A), multiplied by the probability of the observation given state A, P(O|A), and divided by the probability of making observation O, P(O)
  • The core approach of Bayes’ theorem is to determine which possibility seems the most likely to be true
  • The prior probability is influenced by cues like familiar size, and accidental viewpoint
  • Ideal observer analysis: if we know the quality of each source of info, we can determine the best possible performance of someone using that info
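
Bayes' theorem as written above can be applied to a small set of candidate world states. The states and all probabilities below are made-up numbers for illustration, and `posterior` is a hypothetical helper.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' theorem over mutually exclusive world states.

    priors:      {state: P(A)}
    likelihoods: {state: P(O|A)} for the observation actually made
    returns:     {state: P(A|O)}
    """
    evidence = sum(priors[a] * likelihoods[a] for a in priors)  # P(O)
    return {a: priors[a] * likelihoods[a] / evidence for a in priors}

# A small retinal image of a coin could come from a normal coin far away or
# from a tiny coin close by. Both explain the image equally well, but
# familiar size makes the tiny coin unlikely a priori.
priors = {"normal coin, far": 0.95, "tiny coin, near": 0.05}
likelihoods = {"normal coin, far": 0.5, "tiny coin, near": 0.5}
post = posterior(priors, likelihoods)
print(max(post, key=post.get), post)
```

When the observation does not favour either state, the prior decides the outcome; a more diagnostic observation, with unequal likelihoods, can override it.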

Binocular rivalry and suppression

  • The more interesting stimulus is likely to be dominant
    • Interesting: whichever stimulus is more noticeable to the early stages of cortical visual processing; high contrast is more noticeable than low contrast, bright more than dim, moving objects more than stationary ones, and so forth
  • The meaning of a stimulus also has an effect, as does what you are attending to
  • Binocular rivalry: the competition between the two eyes for control of visual perception
    • Part of the larger effort of the visual system to come up with the most likely version of the world, given the current retinal images
    • Dissociates the stimulus on the retina from the stimulus that you see

Development of binocular vision and stereopsis

  • Stereopsis suddenly appears between 3-5 months of age
  • Stereoacuity: a measure of the smallest binocular disparity that can generate a sensation of depth
  • The neural apparatus in V1 of newborns is capable of combining signals from the two eyes, and it is sensitive to interocular disparities
    • Dichoptic sine-wave gratings identical in frequency, orientation, contrast and velocity evoke higher binocular responses when shown with a phase shift than the responses from the left or right eye alone
    • But no stereopsis is present, because V1 cells are still immature: they don’t have the adult sensitivity and overall respond less than adult neurons. Receptive-field properties in V2 mature later than in V1

Abnormal visual experience can disrupt binocular vision

  • The basic capacity for binocular vision is present from birth and does not have to be learned; however, the normal development of binocular vision and stereopsis does require visual experience
  • Strabismus: a misalignment of the eyes such that a single object in space is imaged on the fovea of one eye, and on the non-foveal area of the other eye
    • Esotropia: strabismus in which one eye deviates inward
    • Exotropia: strabismus in which one eye deviates outward
  • Individuals who exhibited strabismus during the first 18 months of their life do not show the normal interocular transfer, like the one found in the tilt aftereffect
    • Due to strabismus, normally corresponding points in the two eyes receive conflicting info. The most common response is suppression of the input from the eye that is turned in or out
  • Strabismus greatly reduces the number of binocular neurons in the visual cortex: cells that would normally be driven by both eyes are dominated by only one. This disrupts stereopsis, so if surgical correction is needed, it has to be done before the onset of stereopsis in order to minimize the damage