By Sasha Nikitinska
The human eye deceives us: what we see is not necessarily how things are, as any optical illusion demonstrates. This is a known fact, but what does it entail? In the past, researchers focused on 2-D perceptual error, building mathematical models to explain how humans misjudge speed, direction, and clarity. Now the focus has shifted to 3-D perception. A mathematical model of 3-D error would enable practical advances in everything from judging danger in traffic to sports vision. But this is more difficult than it seems.
In 2002, the first mathematical model of 2-D perceptual error was created, after years of trying to explain why humans do not accurately interpret what they see. The model was built on a Bayesian framework and accounted for several known features of 2-D perception, for example, that speed is underestimated in low-contrast conditions. A Bayesian framework estimates the probability of an event, in this case a particular perceptual error, by combining prior expectations with noisy sensory evidence.
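To make that concrete, here is a minimal sketch in Python of how a Bayesian estimator can underestimate speed at low contrast. It is not the researchers' actual model: the zero-mean "slow world" prior, the Gaussian noise, and the assumption that sensory noise scales as 1/contrast are all illustrative choices.

```python
import numpy as np

def bayes_speed_estimate(true_speed, contrast, prior_sigma=1.0, base_noise=0.5):
    """Posterior-mean speed estimate: a zero-mean 'slow world' prior
    combined with a noisy measurement of the true speed.
    All parameter values are illustrative, not from the study."""
    like_sigma = base_noise / contrast          # noisier evidence at low contrast
    # Gaussian prior x Gaussian likelihood has a closed-form posterior mean:
    # the weight on the measurement shrinks as the measurement gets noisier.
    w = prior_sigma**2 / (prior_sigma**2 + like_sigma**2)
    return w * true_speed

for c in (1.0, 0.15, 0.075):   # the contrast levels used in the study
    est = bayes_speed_estimate(true_speed=2.0, contrast=c)
    print(f"contrast {c:6.1%}: perceived speed ~ {est:.2f} (true 2.00)")
```

The estimate collapses toward the slow prior as contrast falls, and it does so faster than linearly, a pattern that reappears in the team's results below.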
3-D motion, however, is more complex, involving lateral motion bias, the direction of motion, and a wide variety of viewing conditions such as changing backgrounds and peripheral distractions. Lateral bias, the misinterpretation of motion along the x axis, has been studied frequently because it is easy to measure. Researchers have also tried to recreate real-life scenarios and measure the many factors that may affect misperception, but these attempts are limited: without a substantial mathematical model, it is hard to account for so many variables.
Emily Cooper, a research professor at UC Berkeley, and her team, with members from the University of Wisconsin, Princeton University, and Dartmouth College, took on the task of creating a mathematical model of 3-D vision error. Their goal was to combine the many aspects of vision into a single model that explains the misperceptions that commonly occur in naturalistic viewing, accounting for stimulus distance, contrast, viewing geometry, and optimal inference.
Cooper's team derived a Bayesian mathematical model and then set out to test it. They conducted three experiments in which observers indicated the direction of stimuli moving in randomly chosen directions in the x-z plane. In the first experiment, stimuli were rendered on a Macintosh computer and displayed through an Oculus Rift DK1 VR headset, which tracked overall head movement: yaw, pitch, and roll. Participants viewed a virtual room with a white sphere, the stimulus, at its center. The sphere moved in a random direction and was presented at three contrast levels: 100%, 15%, and 7.5%. After the stimulus disappeared from view, participants placed a flat white paddle at the angle and location where it would have intercepted the sphere perpendicularly, providing a measurement of where and in what direction each subject perceived the stimulus to be moving. Participants completed 12-15 practice trials with the experimenter and then a self-paced block with randomized contrasts.
The experiment pointed Cooper's team to two factors driving participant error: the contrast of the stimulus and the distance between the observer and the stimulus. Recording paddle-placement errors in the z plane (medial errors) as positive values and errors in the x plane (lateral errors) as negative values, the team determined where the eye erred most. The observed errors were then compared against predictions calculated from the previously derived formula, and the model, which incorporates distance and contrast, matched them closely.
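As an illustration of that sign convention (my reconstruction, not the team's analysis code), a paddle-placement error can be split into its depth and lateral components like so:

```python
import numpy as np

def signed_component_errors(true_deg, reported_deg):
    """Split a direction error in the x-z plane into its components,
    following the article's convention: medial (z, depth) errors are
    recorded as positive, lateral (x) errors as negative.
    Angles are in degrees, with 0 = rightward (x) and 90 = receding (z).
    Purely illustrative; not the team's analysis code."""
    t, r = np.radians(true_deg), np.radians(reported_deg)
    medial = +abs(np.sin(r) - np.sin(t))    # error along z, logged positive
    lateral = -abs(np.cos(r) - np.cos(t))   # error along x, logged negative
    return medial, lateral

m, l = signed_component_errors(true_deg=30, reported_deg=55)
print(f"medial error {m:+.3f}, lateral error {l:+.3f}")
```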
Cooper's team set up a second experiment to verify the first by removing the virtual reality component. Using a CRT (cathode-ray tube) display, three subjects performed the same task with a stereoscope, a device that combines two images taken from slightly different angles into a single 3-D image. The setup was similar, except that instead of a single sphere the stimulus was a line of spheres, and subjects placed the paddle perpendicular to the line. Contrast was varied in an analogous way: half the stimulus dots were darker or brighter than the background. This experiment reproduced the results of the first.
One question remained: does the model hold for stimuli away from the center of gaze? Cooper's team ran an additional experiment with 21 members of the University of Wisconsin community. The apparatus was set up similarly, but stimuli could appear at three locations: at the center, or 20 degrees to the left or right of it. Subjects were not allowed to move their heads but could move their eyes, so the stimulus location changed in 3-D space while remaining comparable on the retina. Each participant completed 360 trials, with stimuli always at full contrast. The experiment recorded mean angular error, motion-in-depth direction confusions (z plane), and lateral direction confusions (x plane), and compared them once again against the mathematical model.
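For concreteness, those three measures might be computed from a set of trials along these lines. This is an illustrative reconstruction in Python, not the team's code, and the angle convention (0 degrees = rightward, 90 degrees = receding) is an assumption; the trial data are made up.

```python
import numpy as np

def direction_metrics(true_deg, reported_deg):
    """Summarize direction judgments in the x-z plane.
    Angles in degrees: 0 = rightward (x), 90 = receding (z).
    An illustrative reconstruction of the three reported measures."""
    t = np.radians(np.asarray(true_deg))
    r = np.radians(np.asarray(reported_deg))
    # Mean angular error: absolute angle between true and reported directions.
    diff = np.angle(np.exp(1j * (r - t)))            # wrap to [-pi, pi]
    mean_angular_error = np.degrees(np.abs(diff)).mean()
    # Depth (z) confusions: approaching reported as receding, or vice versa.
    z_conf = np.mean(np.sign(np.sin(t)) != np.sign(np.sin(r)))
    # Lateral (x) confusions: leftward reported as rightward, or vice versa.
    x_conf = np.mean(np.sign(np.cos(t)) != np.sign(np.cos(r)))
    return mean_angular_error, z_conf, x_conf

true = [10, 80, 100, 170, 260]
reported = [15, 285, 85, 160, 250]   # made-up trial data
mae, zc, xc = direction_metrics(true, reported)
print(f"mean angular error {mae:.1f} deg, z confusions {zc:.0%}, x confusions {xc:.0%}")
```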
The model's predictions came very close to the measured values. Experiments 1 and 2 showed that inaccuracy, or noise, increased as contrast decreased. Both the model and the data showed that error is not linearly proportional to contrast: error grows more quickly at lower contrasts. The results also revealed a trade-off: when participants judged lateral motion accurately, their judgments of motion in depth were less accurate, and vice versa. Lateral confusions were less common overall but were nonetheless affected by viewing distance and contrast.
While the model explains most aspects of 3-D motion perception, it does not explain why observers perceived a stimulus as approaching or receding; that is the next step in the mathematical derivation. And since the experiments were all conducted in controlled settings, applying the model to real-world scenarios is another important step toward realizing its benefits.
Scientists hope to use this information to advance sports vision science, a growing field. The model could give a running back a better read on an incoming ball, or a golfer more practical information about distances on the course. More practically still, it could be used to assess the safety of driving conditions such as rain, snow, haze, fog, or night driving, and to inform technology and regulations that reduce the number of car crashes in those conditions. Creating this model is a first step toward letting us, humans, see what is actually going on around us.
Image credit: Amanda Dalbjorn