Digital Amphitheatre

An important motivation in the design of the Digital Amphitheatre was providing a feeling of presence, so that all participants believe they are in the same location, e.g., an amphitheatre, a classroom, or a beachside resort. This requires removing each participant's own background and substituting a common background of choice.

Background Substitution

The background substitution process requires an initial background image to use as a baseline for comparison. Once the camera has been positioned and adjusted for use during the meeting, the participant moves out of the field of view of the camera for a few seconds to allow the software to collect several frames of the background. These images are averaged together to provide a low-noise estimate of the background.
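The averaging step can be sketched as follows. This is a minimal illustration, assuming frames arrive as NumPy arrays of shape height x width x 3; the function name `estimate_background` and the synthetic test frames are hypothetical and not taken from the original system.

```python
import numpy as np

def estimate_background(frames):
    """Average a sequence of HxWx3 frames into one float32 background estimate."""
    stack = np.stack([np.asarray(f, dtype=np.float32) for f in frames])
    # The per-pixel mean over time suppresses zero-mean sensor noise.
    return stack.mean(axis=0)

# Synthetic stand-in for captured frames: a constant scene plus random noise.
rng = np.random.default_rng(0)
scene = np.full((4, 4, 3), 100.0, dtype=np.float32)
frames = [scene + rng.normal(0.0, 5.0, scene.shape) for _ in range(50)]
background = estimate_background(frames)
```

Averaging N frames reduces the standard deviation of independent noise by roughly a factor of the square root of N, which is why a few seconds of video is enough for a usable estimate.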

After this brief training period, the participant returns to their seat. A textbook background segmentation algorithm then separates the region of the current video image that has changed significantly from the background, so that everything outside that region can be replaced with the substitute background.
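Once a binary foreground mask is available, the substitution itself is a per-pixel selection. The sketch below assumes a boolean mask of shape height x width and images of shape height x width x 3; the name `substitute_background` is hypothetical.

```python
import numpy as np

def substitute_background(frame, mask, new_background):
    """Keep frame pixels where mask is True; use new_background elsewhere."""
    frame = np.asarray(frame)
    new_background = np.asarray(new_background)
    mask = np.asarray(mask, dtype=bool)
    # Broadcast the HxW mask across the color channels.
    return np.where(mask[..., None], frame, new_background)

# Tiny example: one foreground pixel on a 2x2 image.
frame = np.full((2, 2, 3), 200, dtype=np.uint8)
new_background = np.full((2, 2, 3), 10, dtype=np.uint8)
mask = np.array([[True, False], [False, False]])
composite = substitute_background(frame, mask, new_background)
```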

A direct comparison between the current and background frames is made difficult by features common to many commodity video cameras including lighting changes, automated exposure, dynamic white balance, and increased noise. For this reason, there is a scaling step in our algorithm: we compare pixels primarily on their color, but allow the apparent intensity to vary in order to compensate for changes in brightness. The resulting distances are thresholded to produce a binary mask labeling the pixels as foreground or background.
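The text does not give the exact distance formula, so the following is only one plausible reading of the scaling step. It assumes that each background pixel may be multiplied by a brightness factor, chosen by least squares and clamped to a limited range, before the remaining color difference is measured. The function name, the threshold, and the scale range are all hypothetical values chosen for illustration.

```python
import numpy as np

def foreground_mask(frame, background, threshold=30.0, scale_range=(0.6, 1.4)):
    """Label pixels whose color differs from the background after brightness scaling."""
    frame = np.asarray(frame, dtype=np.float32)
    background = np.asarray(background, dtype=np.float32)
    # Least-squares brightness factor s per pixel, minimizing |frame - s * background|.
    dot = (frame * background).sum(axis=-1)
    norm = (background * background).sum(axis=-1)
    scale = np.clip(dot / np.maximum(norm, 1e-6), *scale_range)
    # Color distance that remains once brightness has been compensated.
    residual = frame - scale[..., None] * background
    distance = np.sqrt((residual ** 2).sum(axis=-1))
    return distance > threshold

# Example: the whole frame is 20% brighter, and one pixel has changed color.
background = np.tile(np.array([100.0, 120.0, 80.0], dtype=np.float32), (3, 3, 1))
frame = background * 1.2
frame[1, 1] = (200.0, 20.0, 20.0)
mask = foreground_mask(frame, background)
```

In this example the uniformly brighter pixels are absorbed by the scale factor and stay labeled as background, while the pixel whose color has changed is labeled as foreground.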

It is also possible that natural backgrounds, such as an office, contain small regions that are difficult to distinguish from the foreground. We apply morphological operators to the mask to compensate for these small regions of anomalous color match. First, the mask is eroded by a radius of two pixels to remove most of the isolated regions caused by noise in the current frame. The mask is then dilated by roughly twice the erosion radius to fill in voids (an additional erosion step may be performed, depending on the amount of noise in the image). Finally, the mask is eroded once more, so that the erosions and dilations balance to zero overall, which restores the outer boundary of the foreground.
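The erode, dilate, erode sequence described above can be sketched with plain NumPy. This is an illustration under stated assumptions: a square structuring element, pixels beyond the image border treated as foreground during erosion, and a dilation radius of exactly twice the erosion radius. The helper names `dilate`, `erode`, and `clean_mask` are hypothetical.

```python
import numpy as np

def dilate(mask, radius):
    """Binary dilation with a (2*radius+1) square structuring element."""
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    # OR together every shifted copy of the mask within the square window.
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def erode(mask, radius):
    """Binary erosion, computed as the complement of dilating the complement."""
    return ~dilate(~mask, radius)

def clean_mask(mask, radius=2):
    mask = np.asarray(mask, dtype=bool)
    mask = erode(mask, radius)        # remove isolated specks
    mask = dilate(mask, 2 * radius)   # fill voids inside the foreground
    return erode(mask, radius)        # restore the outer boundary

# Example: a solid block with a small hole, plus one isolated noise pixel.
noisy = np.zeros((20, 20), dtype=bool)
noisy[4:16, 4:16] = True
noisy[9:11, 9:11] = False   # void inside the foreground
noisy[1, 1] = True          # isolated speck
cleaned = clean_mask(noisy)
```

In this example the speck disappears in the first erosion, the hole is closed by the dilation, and the final erosion returns the block to its original outline.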

We use a low-complexity algorithm with acceptable performance. Performance suffers when the background is subject to large changes in lighting; a more dynamic approach to updating the stored background image would improve it. The system also has to be retrained if the background changes, although fortunately training is a simple process.