Microsoft HoloLens 2: Improved Research Mode to facilitate computer vision research

Published August 28, 2020

By Marc Pollefeys, Partner Director of Science


Image: a man wearing a HoloLens 2 device

Since its launch in November 2019, Microsoft HoloLens 2 has helped enterprises in manufacturing, construction, healthcare, and retail onboard employees more quickly, complete tasks faster, and greatly reduce errors and waste. It sets the high-water mark for intelligent edge devices by leveraging a multitude of sensors and a dedicated ASIC (Application-Specific Integrated Circuit) to allow multiple real-time computer vision workloads to run continuously. In Research Mode, HoloLens 2 is also a potent computer vision research device. (Note: Research Mode is available today to Windows Insiders and will soon be included in an upcoming release of Windows 10 for HoloLens.)

Compared to the previous edition, Research Mode for HoloLens 2 has the following main advantages:

In addition to the sensors exposed in HoloLens 1 Research Mode, we now also provide access to the IMU sensors (accelerometer, gyroscope, and magnetometer).
HoloLens 2 provides new capabilities that can be used in conjunction with Research Mode. Specifically, articulated hand tracking and eye tracking can be accessed through APIs while using Research Mode, allowing for a richer set of experiments.

With Research Mode, application code can not only access video and audio streams, but can also simultaneously leverage the results of built-in computer vision algorithms such as SLAM (simultaneous localization and mapping) to obtain the motion of the device as well as the spatial-mapping algorithms to obtain 3D meshes of the environment. These capabilities are made possible by several built-in image sensors that complement the color video camera normally accessible to applications.

HoloLens 2 has four grayscale head-tracking cameras and a depth camera to sense its environment and perform articulated hand tracking. It also has two additional infrared cameras and accompanying LEDs that are used for eye tracking and iris recognition. As shown in Figure 1, two of the grayscale cameras are configured as a stereo rig, capturing the area in front of the device so that the absolute depth of tracked visual features can be determined through triangulation. Meanwhile, the two additional grayscale cameras help provide a wider field of view to keep track of features. These synchronized global-shutter cameras are significantly more sensitive to light than the color camera and can be used to capture images at a rate of up to 30 frames per second (FPS).

Figure 1: HoloLens 2 Research Mode enables access to the grayscale cameras, depth camera, and IMU sensors on the device. This complements the color camera normally available to applications.
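To make the triangulation step concrete, here is a minimal sketch of how depth follows from a rectified stereo pair such as the two front-facing grayscale cameras. It uses the standard pinhole relation (depth = focal length × baseline / disparity). The function name and all numeric values are illustrative assumptions, not HoloLens 2 calibration data or part of the Research Mode API.

```python
def stereo_depth(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth in meters of a feature matched in both images of a rectified stereo pair.

    focal_length_px: focal length in pixels (assumed equal for both cameras)
    baseline_m:      distance between the two camera centers, in meters
    disparity_px:    horizontal shift of the feature between the two images, in pixels
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical numbers chosen only for illustration.
depth_m = stereo_depth(focal_length_px=400.0, baseline_m=0.1, disparity_px=20.0)
print(f"estimated depth: {depth_m:.2f} m")
```

Because depth is inversely proportional to disparity, distant features produce small disparities and therefore noisier depth estimates, which is one reason a known, fixed baseline matters for recovering absolute scale.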

The depth camera uses active infrared (IR) illumination to determine depth through phase-based time-of-flight. The camera can operate in two modes. The first mode enables high-framerate (45 FPS) near-depth sensing, commonly used for hand tracking, while the other mode is used for lower-framerate (1-5 FPS) far-depth sensing, currently used by spatial mapping. As hands only need to be supported up to 1 meter from the device, HoloLens 2 saves power by reducing the number of illuminations, which results in the depth wrapping around beyond one meter. For example, in this mode something at 1.3 meters will appear at 0.3 meters. In addition to depth, this camera also delivers actively illuminated IR images (in both modes) that can be valuable in their own right because they are illuminated from the HoloLens and reasonably unaffected by ambient light. Azure Kinect uses the same sensor package, but with slightly different depth modes.
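The wrap-around behavior can be illustrated with a short sketch. It assumes the unambiguous range in the near-depth mode is exactly 1 meter, which is inferred from the 1.3 m to 0.3 m example above rather than taken from a sensor specification. Both helper functions are hypothetical and are not part of the Research Mode API.

```python
def wrapped_depth(true_depth_m: float, unambiguous_range_m: float = 1.0) -> float:
    """Depth the sensor would report for a surface at true_depth_m, given wrap-around."""
    if true_depth_m < 0:
        raise ValueError("depth must be non-negative")
    return true_depth_m % unambiguous_range_m


def candidate_depths(measured_m: float, max_depth_m: float,
                     unambiguous_range_m: float = 1.0) -> list[float]:
    """All true depths up to max_depth_m that are consistent with a wrapped reading."""
    candidates = []
    k = 0
    while measured_m + k * unambiguous_range_m <= max_depth_m:
        candidates.append(measured_m + k * unambiguous_range_m)
        k += 1
    return candidates


# A surface at 1.3 m is reported at roughly 0.3 m.
print(round(wrapped_depth(1.3), 3))
# A 0.3 m reading alone cannot distinguish between these possibilities.
print([round(d, 3) for d in candidate_depths(0.3, max_depth_m=2.5)])
```

A wrapped reading is therefore ambiguous on its own. Code that consumes near-depth frames needs some other cue, such as the expected working range for hands or a hypothetical cross-check against the far-depth mode, to reject surfaces that lie beyond one meter.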

With the newest Windows Insider release of Windows 10 for HoloLens, researchers now have the option to enable Research Mode on their HoloLens devices to gain access to all of these externally facing raw image sensor streams. Research Mode for HoloLens 2 also provides researchers with access to the accelerometer, gyroscope, and magnetometer readings. To protect users’ privacy, raw eye-tracking camera images are not available through Research Mode. Researchers can instead access eye-gaze direction through existing APIs.

For the other sensor streams, researchers can still use the results of the built-in computer vision algorithms, and can now also choose to run their own algorithms on the raw sensor data.

The sensor streams can be processed or stored on the device, or transferred wirelessly to another PC or to the cloud for more computationally demanding tasks. This opens a wide range of new computer vision applications for HoloLens 2. HoloLens 2 is particularly well suited as a platform for egocentric vision research, as it can be used to analyze the world from the perspective of a user wearing the device. For these applications, the ability of HoloLens devices to visualize the results of the algorithms in the 3D world in front of the user can be a key advantage. HoloLens sensing capabilities can also be very valuable for robotics, where they can, for example, enable a robot to navigate its environment.

These new HoloLens capabilities will be demonstrated at a tutorial on August 28th, 2020, at the European Conference on Computer Vision (ECCV). An initial set of sample apps showcasing computer vision use cases is being made available on GitHub, and you can check out the Research Mode documentation for further technical details.