of Virtual Reality
VR can be classified into three stages -- passive, exploratory, and immersive. Passive VR refers to experiences most people are familiar with in everyday life -- watching TV, seeing movies, reading books, or visiting amusement parks. Exploratory VR is the interactive exploration of a 3D environment solely through a computer monitor. Immersive VR is the classic stage of VR, in which the user fully interacts with the artificial environment, is provided stimulation for all the senses, and has their actions directly affect the computer generated environment. Please consult the references at the end of this article for more information about the history and roots of VR.
Physically, our eyes are fairly complicated organs. Specialized cells form structures that perform several functions -- the pupil acts as the aperture, with muscles controlling how much light passes through; the crystalline lens focuses light, with muscles changing its shape; and the retina is the workhorse, converting light into electrical impulses for processing by our brains. Our brain performs visual processing by breaking the neural information down into smaller chunks and passing it through several filtering layers of neurons. Some of these neurons detect only drastic changes in color, while others detect only vertical or horizontal edges.
Depth information is conveyed in many different ways. Static depth cues include interposition, brightness, size, linear perspective, and texture gradients. Motion depth cues come from the effect of motion parallax, in which objects closer to the viewer appear to move more rapidly against the background when the head is moved back and forth. Physiological depth cues convey information in two distinct ways -- accommodation, the change in shape of the eye's lens when focusing on objects at different distances, and convergence, a measure of how far our eyes must turn inward when looking at objects closer than about 20 feet. We obtain stereoscopic cues by extracting depth information from the differences between the left and right views coming from each of our eyes.
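As a rough illustration of the convergence and stereoscopic cues, the sketch below computes the vergence angle and the binocular disparity between two objects from simple trigonometry. The interpupillary distance used is an assumed average value, not a figure from this article.

```python
import math

IPD_M = 0.063  # interpupillary distance in meters (assumed adult average)

def convergence_angle(distance_m: float, ipd_m: float = IPD_M) -> float:
    """Total inward rotation of both eyes (radians) when fixating an
    object straight ahead at the given distance."""
    return 2.0 * math.atan((ipd_m / 2.0) / distance_m)

def angular_disparity(near_m: float, far_m: float, ipd_m: float = IPD_M) -> float:
    """Binocular disparity (radians) between objects at two distances --
    the difference our visual system compares between the two eyes."""
    return convergence_angle(near_m, ipd_m) - convergence_angle(far_m, ipd_m)

if __name__ == "__main__":
    for d in (0.5, 2.0, 6.0):  # 6 m is roughly the 20 feet mentioned above
        print(f"{d:4.1f} m: convergence = {math.degrees(convergence_angle(d)):5.2f} deg")
    print(f"disparity 0.5 m vs 2.0 m: {math.degrees(angular_disparity(0.5, 2.0)):.2f} deg")
```

The convergence angle falls off rapidly with distance, which is why it stops being a useful cue beyond roughly 20 feet.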
Our sense of visual immersion in VR comes from several factors, including field of view, frame refresh rate, and eye tracking. A limited field of view can produce a feeling of tunnel vision. Frame refresh rates must be high enough for our eyes to blend the individual frames into the illusion of motion, and to limit the latency between movements of the head and body and the regeneration of the scene. Eye tracking can solve the problem of someone not looking where their head is oriented. It can also help reduce the computational load of rendering frames, since we could render in high resolution only where the eyes are looking.
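The idea of rendering in high resolution only where the eyes are looking can be sketched as a simple gaze-dependent resolution schedule. The angular thresholds and scale factors below are illustrative assumptions, not values from any system described here.

```python
# A minimal sketch of gaze-dependent (foveated) resolution selection.
# The thresholds and scale factors are illustrative assumptions.

FOVEA_DEG = 5.0       # full resolution inside this angle from the gaze point
PERIPHERY_DEG = 20.0  # beyond this angle, drop to the coarsest level

def resolution_scale(angle_from_gaze_deg: float) -> float:
    """Fraction of full resolution to use for a screen region at the given
    angular distance from where the eyes are looking."""
    if angle_from_gaze_deg <= FOVEA_DEG:
        return 1.0
    if angle_from_gaze_deg >= PERIPHERY_DEG:
        return 0.25
    # linear falloff between the foveal and peripheral regions
    t = (angle_from_gaze_deg - FOVEA_DEG) / (PERIPHERY_DEG - FOVEA_DEG)
    return 1.0 - 0.75 * t

if __name__ == "__main__":
    for angle in (0.0, 5.0, 10.0, 20.0, 40.0):
        print(f"{angle:5.1f} deg from gaze -> {resolution_scale(angle):.2f}x resolution")
```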
The sense of virtual immersion is usually achieved via some means of position and orientation tracking. The most common means of tracking include optical, ultrasonic, electromagnetic, and mechanical. All of these have been used on various head mounted display (HMD) devices. HMDs come in three basic varieties: stereoscopic, monocular, and head coupled. The earliest stereoscopic HMD was Ivan Sutherland's Sword of Damocles, built in 1968 while he was at Harvard. It got its name from the large mechanical position sensing arm that hung from the ceiling and made the device ungainly to wear. NASA has built several HMDs, chiefly using LCD displays with poor resolution. The University of North Carolina has also built several HMDs using such items as LCD screens, magnifying optics, and bicycle helmets. VPL Research's EyePhone series were the first commercial HMDs. A good example of a monocular HMD is the Private Eye by Reflection Technologies of Waltham, MA. This unit is just 1.2 x 1.3 x 3.5 inches and is suspended by a lightweight headband in front of one eye; the wearer sees a 12-inch monitor floating in mid air about 2 feet in front of them. The BOOM is a head coupled display developed at NASA's Ames Research Center. It uses two 2.5 inch CRTs mounted in a small black box with hand grips on each side, attached to the end of an articulated, counterbalanced arm that also serves as the position sensor.
Our sense of sound localization comes from three different cues. Interaural time difference is the difference in the time at which a sound reaches our left ear versus our right ear. Interaural intensity difference is a measure of how much louder a sound is at the nearer ear than at the farther one. Acoustic shadow is the effect of higher frequency sounds being blocked by objects between us and the sound's source.
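The interaural time difference cue can be approximated with Woodworth's spherical-head formula, ITD ~ (r/c)(theta + sin theta), where theta is the source azimuth. The sketch below uses an assumed average head radius and the speed of sound in air; it is an approximation, not a description of any system in this article.

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius in meters
SPEED_OF_SOUND = 343.0   # meters per second in air at room temperature

def interaural_time_difference(azimuth_deg: float) -> float:
    """Approximate arrival-time difference (seconds) between the two ears
    for a distant source at the given azimuth (0 = straight ahead,
    90 = directly to one side)."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))

if __name__ == "__main__":
    for az in (0, 30, 60, 90):
        itd_us = interaural_time_difference(az) * 1e6
        print(f"{az:3d} deg azimuth -> ITD = {itd_us:6.1f} microseconds")
```

Even at 90 degrees the difference is well under a millisecond, which is why a virtual sound system must control timing very precisely to place sounds convincingly.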
In VR systems, computer generated sound comes in several different forms. The use of stereo sound adds some level of audio feedback to the VR environment, but does not correctly resemble the real world. With 3D sound, we can "place" sounds within the simulated environment using the sound localization cues described above. A 3D sound system usually begins by recording the differences in the sound reaching each of our ears, using microphones placed at each ear. The recordings are then used to produce what is called a head related transfer function (HRTF). These HRTFs are applied during playback of recorded sounds to effectively place them within a 3D environment. A virtual sound system must not only provide the same sound localization cues but must also change and react in real time to move those sounds around within the 3D environment. An example of a 3D sound system is the Convolvotron, developed by Crystal River Engineering. This system convolves analog audio source material with the HRTFs, creating a startlingly realistic 3D sound effect. Another system, the Virtual Audio Processing System (VAPS), combines noninteractive binaural recording techniques with Convolvotron-like signal processing to produce both live and recorded 3D sound fields. A more recent development is the attempt at what is called aural ray tracing, which is similar to the light ray tracing found in computer graphics.
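A Convolvotron-style placement of a sound can be sketched as a convolution of a mono source with left- and right-ear head related impulse responses. The impulse responses below are toy placeholders; a real system would use measured HRTF data for the desired direction and update the filters in real time as the source or listener moves.

```python
import numpy as np

def spatialize(mono: np.ndarray, hrir_left: np.ndarray, hrir_right: np.ndarray) -> np.ndarray:
    """Return a (samples, 2) stereo array with the mono source convolved
    against each ear's head related impulse response."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    n = max(len(left), len(right))
    out = np.zeros((n, 2))
    out[:len(left), 0] = left
    out[:len(right), 1] = right
    return out

if __name__ == "__main__":
    rate = 44100
    t = np.arange(rate) / rate
    tone = 0.5 * np.sin(2 * np.pi * 440 * t)        # one second of a 440 Hz tone
    hrir_l = np.array([1.0, 0.3, 0.1])              # toy impulse responses: right ear
    hrir_r = np.array([0.0, 0.0, 0.6, 0.2, 0.05])   # delayed and quieter than the left
    stereo = spatialize(tone, hrir_l, hrir_r)
    print(stereo.shape)
```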
In VR systems, tactile and force feedback devices seek to emulate the tactile cues our haptic system relays to our brains. Several examples of force feedback devices have been built. The Argonne Remote Manipulator provided force feedback via a mechanical arm assembly and many tiny motors; this device was used in the molecular docking simulations of the GROPE system. The Portable Dextrous Master used piston-like cylinders mounted on ball joints to pass force feedback from a robot's gripper to the operator's hand. The TeleTact system used two gloves -- one acquired touch data via an array of force-sensitive resistors and relayed that information to a second glove, which provided feedback through many small air bladders that inflated at the pressure points to reproduce the touch information.
In many VR systems the ubiquitous data glove plays the same role as the mouse in modern computer systems. The VPL Data Glove is perhaps the best known. It used a series of fiber optic cables to detect the bending of the fingers, and magnetic sensors for position and orientation tracking of the hand. Mattel's PowerGlove is the most popular among VR system hackers due to its low cost, despite having been discontinued by the manufacturer several years ago. This glove used electrically resistive ink to sense finger position and ultrasonic sensors to detect hand orientation.
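As a rough sketch of how a resistive-ink flex sensor reading might be turned into finger data, the code below linearly maps a raw sensor value to a bend angle between two calibration points. The calibration values and function name are hypothetical, not taken from the PowerGlove's actual hardware.

```python
# Hypothetical calibration: readings with the finger straight and fully bent.
RAW_FLAT = 120
RAW_BENT = 870
MAX_BEND_DEG = 90  # treat a full bend as 90 degrees

def finger_bend_degrees(raw: int) -> float:
    """Linearly interpolate a raw flex-sensor reading to a bend angle,
    clamped to the calibrated range."""
    t = (raw - RAW_FLAT) / (RAW_BENT - RAW_FLAT)
    t = min(max(t, 0.0), 1.0)
    return t * MAX_BEND_DEG

if __name__ == "__main__":
    for raw in (100, 120, 495, 870, 950):
        print(f"raw {raw:4d} -> {finger_bend_degrees(raw):5.1f} degrees")
```

A real glove would calibrate these endpoints per user, typically by asking the wearer to hold the hand flat and then make a fist.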
Silicon Mirage, Steve Aukstakalnis and David Blatner, Peachpit Press, 1992
Virtual Reality: Through the New Looking Glass, Ken Pimentel and Kevin Teixeira, Windcrest Books, 1992
1993 IEEE Virtual Reality Annual International Symposium
AI Expert magazine, Miller Freeman Inc.
Virtual Reality World magazine, Mecklermedia
lingard@wpi.wpi.edu