Computer vision people tracking in interactive installations

Computer vision people tracking for interactive installations is a runtime problem as much as a sensing problem. The goal is low latency, stable tracks, and behaviour that stays usable when light, density, and occlusion change.

Real-time people tracking for interactive installations is rarely about perfect data. The main problem is keeping interaction clear when the space is crowded, the lighting changes during the day, and the background is visually complex.

There are many camera and sensing options for interactive installations, and we have tested many of them in production installs. Most can work in the right conditions. The differences show up once a computer vision pipeline has to run continuously in a fixed physical setup. At that point three factors matter most: low end-to-end latency, stable detections, and a system that can run 24/7 with predictable behaviour.

Raw detections are noisy. They jitter, drop out for a few frames under occlusion, and jump in depth when lighting or crowd density changes. Passing them directly into content produces flicker and unstable responses. A lightweight tracking and filtering step rejects weak detections, matches the remaining detections to existing tracks using position and depth, then smooths outputs over time with hysteresis and short holds so positions stay stable without adding noticeable lag.
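As a rough illustration of the tracking and filtering step described above, here is a minimal Python sketch. It is not our production code: the class names (`Detection`, `Track`, `SimpleTracker`), the greedy nearest-neighbour matching, the exponential smoothing, and every threshold value are assumptions chosen for readability, and real values would need tuning per space.

```python
import math
from dataclasses import dataclass
from itertools import count

# All values below are illustrative assumptions, not tuned settings.
MIN_CONFIDENCE = 0.5  # detections weaker than this are rejected
MAX_MATCH_DIST = 0.6  # max combined position/depth distance for a match
DEPTH_WEIGHT = 1.0    # relative weight of depth in the matching distance
CONFIRM_HITS = 3      # hysteresis: matched frames needed before a track is reported
HOLD_FRAMES = 5       # short hold: frames a confirmed track survives with no match
SMOOTHING = 0.5       # exponential smoothing factor (1.0 means no smoothing)


@dataclass
class Detection:
    x: float
    y: float
    depth: float
    confidence: float


@dataclass
class Track:
    track_id: int
    x: float
    y: float
    depth: float
    hits: int = 1
    missed: int = 0

    @property
    def confirmed(self) -> bool:
        return self.hits >= CONFIRM_HITS


def _distance(t: Track, d: Detection) -> float:
    """Euclidean distance over position and weighted depth."""
    dz = DEPTH_WEIGHT * (t.depth - d.depth)
    return math.sqrt((t.x - d.x) ** 2 + (t.y - d.y) ** 2 + dz ** 2)


class SimpleTracker:
    def __init__(self) -> None:
        self.tracks: list[Track] = []
        self._ids = count(1)

    def update(self, detections: list[Detection]) -> list[Track]:
        # 1. Reject weak detections.
        candidates = [d for d in detections if d.confidence >= MIN_CONFIDENCE]

        # 2. Greedy matching: closest track/detection pairs first.
        pairs = sorted(
            (_distance(t, d), ti, di)
            for ti, t in enumerate(self.tracks)
            for di, d in enumerate(candidates)
        )
        used_tracks: set[int] = set()
        used_dets: set[int] = set()
        for dist, ti, di in pairs:
            if dist > MAX_MATCH_DIST:
                break  # pairs are sorted, so every later pair is too far
            if ti in used_tracks or di in used_dets:
                continue
            used_tracks.add(ti)
            used_dets.add(di)
            t, d = self.tracks[ti], candidates[di]
            # 3. Smooth the track towards the new measurement.
            t.x += SMOOTHING * (d.x - t.x)
            t.y += SMOOTHING * (d.y - t.y)
            t.depth += SMOOTHING * (d.depth - t.depth)
            t.hits += 1
            t.missed = 0

        # 4. Hold confirmed tracks briefly; drop unconfirmed ones on a miss.
        for ti, t in enumerate(self.tracks):
            if ti not in used_tracks:
                t.missed += 1
        self.tracks = [
            t for t in self.tracks
            if t.missed <= HOLD_FRAMES and (t.confirmed or t.missed == 0)
        ]

        # 5. Unmatched detections start new, not-yet-confirmed tracks.
        for di, d in enumerate(candidates):
            if di not in used_dets:
                self.tracks.append(Track(next(self._ids), d.x, d.y, d.depth))

        # Only confirmed tracks are handed to the content layer.
        return [t for t in self.tracks if t.confirmed]
```

Calling `update()` once per frame returns only confirmed tracks, so a single spurious detection never reaches the content, and a confirmed person who is occluded for a few frames keeps their last smoothed position. Greedy matching is used here for brevity; a dense crowd may warrant an optimal assignment step instead.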

Hardware and model choice for an interactive installation are always tied to the environment: light levels, mounting height and angle, distance, field of view, crowd density, and background complexity. Many setups can work, but each has limits that show up quickly if those constraints are ignored.

When stability is critical, redundancy is often more practical than searching for a single perfect configuration. Depending on context, that can mean combining multiple sensing layers or defining a simple fallback behaviour that keeps the interactive installation usable if tracking quality drops.
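One way to express such a fallback is a small mode switch with hysteresis, sketched below in Python. Everything here is hypothetical: the `FallbackSwitch` name, the idea of a per-frame `quality` score between 0 and 1 (for example, mean detection confidence), and the threshold and frame-count values are assumptions for illustration, not a description of a specific install.

```python
from dataclasses import dataclass


@dataclass
class FallbackSwitch:
    """Switches between 'tracked' and 'fallback' modes with hysteresis."""

    # Illustrative assumptions; real values depend on the space and frame rate.
    enter_below: float = 0.4  # quality under this counts towards fallback
    exit_above: float = 0.7   # quality over this counts towards recovery
    enter_after: int = 30     # consecutive poor frames before falling back
    exit_after: int = 60      # consecutive good frames before resuming
    mode: str = "tracked"
    _streak: int = 0

    def update(self, quality: float) -> str:
        if self.mode == "tracked":
            # Count consecutive poor frames; any acceptable frame resets the count.
            self._streak = self._streak + 1 if quality < self.enter_below else 0
            if self._streak >= self.enter_after:
                self.mode, self._streak = "fallback", 0
        else:
            # Require a longer run of clearly good frames before trusting tracking again.
            self._streak = self._streak + 1 if quality > self.exit_above else 0
            if self._streak >= self.exit_after:
                self.mode, self._streak = "tracked", 0
        return self.mode
```

The gap between the two thresholds, together with the longer recovery count, is what stops the content from flipping between modes when quality hovers around a single cut-off. What the content does in `fallback` mode, such as an ambient loop or a coarser presence-based response, is a design decision outside this sketch.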

In practice, the key questions are design questions: what motion in the space should be tracked, what should be ignored, what latency is acceptable, and how the system should behave, rather than simply fail, when the scene becomes ambiguous.

This is the technical layer that lets an installation feel effortless while the space keeps moving.

Related capability: Sensing and spatial response.

Related project: Nespresso New York interactive video wall.