Project Rocket platform—designed for easy, customizable live video analytics—is open source

Authors: Principal Researcher; Principal Researcher; Principal Researcher; Technical Fellow & Chief Technology Officer, Azure for Operators

Thanks to advances in computer vision and deep neural networks (DNNs) in what can arguably be described as the golden age of vision, AI, and machine learning, video analytics systems—systems that perform analytics on live camera streams—are becoming more accurate. This accuracy offers opportunities to support individuals and society in exciting ways, like informing homeowners when a package has been delivered outside their door, allowing people to give their pets the attention they need when out for the day, and detecting high-traffic areas so cities can consider adding a stoplight.

While DNN advancements and DNN inference are enablers, they alone are not enough to extract valuable insights from live videos. Live video analytics requires keeping up with video frame rates, which can be as high as 60 frames per second, so it is crucial to filter frames effectively and avoid the costly processing of every frame. Project Rocket provides a framework to do exactly that.
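To see why filtering matters, consider the per-frame time budget at 60 frames per second. The arithmetic below is illustrative; the 100-millisecond heavy-DNN latency is our own assumption, not a measured figure:

```python
fps = 60
budget_ms = 1000.0 / fps   # time available per frame: about 16.7 ms
heavy_dnn_ms = 100.0       # assumed latency of one heavy-DNN inference

# If every frame went through the heavy DNN, a single processor could
# keep up with only this fraction of the stream:
max_heavy_fraction = budget_ms / heavy_dnn_ms
print(f"{budget_ms:.1f} ms per frame; heavy DNN sustainable on "
      f"{max_heavy_fraction:.0%} of frames")
```

Cheap filters in front of the heavy DNN are what close this gap: most frames are dismissed in a fraction of the budget, leaving the expensive model for the few frames that warrant it.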

Rocket-powered

Rocket—which we’re glad to announce is now open source on GitHub—enables the easy construction of video pipelines for efficiently processing live video streams. You can build, for example, a video pipeline with a cascade of DNNs in which a decoded frame is first passed through a relatively inexpensive “light” DNN like ResNet-18 or Tiny YOLO, and a “heavy” DNN such as ResNet-152 or YOLOv3 is invoked only when required. With Rocket, you can plug in any TensorFlow or Darknet DNN model. You can also augment this pipeline with an even simpler motion filter based on OpenCV background subtraction, as shown in the figure below.

The above figure represents one of several video pipelines that can be built for efficient, customizable live video analytics with the Project Rocket platform. In this pipeline, decoded video frames are filtered first by background subtraction and then by a light DNN detector. Frames requiring further processing are passed to a heavy DNN detector.
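A minimal sketch of such a cascade, in Python with NumPy, may help make the control flow concrete. The function names, thresholds, and the simple frame-differencing motion filter are illustrative assumptions of ours, not Rocket’s actual API, and stand-in callables take the place of real DNNs:

```python
import numpy as np

def make_motion_filter(threshold=25, min_changed_frac=0.01):
    """Background-subtraction-style motion filter: compares each frame
    against a running background estimate and passes frames with enough
    changed pixels. A simplified stand-in for OpenCV's subtractors."""
    state = {"bg": None}
    def has_motion(frame):
        bg = state["bg"]
        if bg is None:
            state["bg"] = frame.astype(np.float32)
            return True  # no background model yet; let the frame through
        diff = np.abs(frame.astype(np.float32) - bg)
        changed_frac = np.mean(diff > threshold)
        state["bg"] = 0.95 * bg + 0.05 * frame  # slowly update background
        return changed_frac >= min_changed_frac
    return has_motion

def cascade(frames, motion_filter, light_dnn, heavy_dnn, conf_threshold=0.8):
    """Run the cascaded pipeline: cheap filters first, heavy DNN last."""
    results = []
    for frame in frames:
        if not motion_filter(frame):
            continue                   # no motion: skip both DNNs entirely
        label, conf = light_dnn(frame)
        if conf < conf_threshold:      # light DNN unsure: escalate
            label, conf = heavy_dnn(frame)
        results.append((label, conf))
    return results

# Stand-in "DNNs" and two toy frames, for demonstration only.
static = np.zeros((4, 4), dtype=np.uint8)
moving = np.full((4, 4), 255, dtype=np.uint8)
light = lambda frame: ("object", 0.5)   # always unsure
heavy = lambda frame: ("car", 0.99)     # confident detector
detections = cascade([static, moving, static], make_motion_filter(),
                     light, heavy)
```

Note how the third (static) frame never reaches either DNN: the motion filter dismisses it, which is exactly the saving the cascade is designed to deliver.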

Cascaded pipelines like the one above allow for very efficient processing of live video streams by filtering out frames with little relevant information and being judicious about invoking resource-intensive operations. Rocket also makes it easy to ship the outputs of the video analytics, such as the number of relevant objects in an object-counting application, to a database for after-the-fact review.
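As a sketch of that last step, here is one way to persist per-camera object counts for later review, using Python’s built-in sqlite3. The schema and function names are our own illustration; Rocket itself is a .NET Core platform and is not tied to SQLite:

```python
import sqlite3
import time

def init_store(path=":memory:"):
    """Create (or open) a small results database for pipeline outputs."""
    conn = sqlite3.connect(path)
    conn.execute("""CREATE TABLE IF NOT EXISTS counts (
                        ts REAL, camera TEXT, object_class TEXT, n INTEGER)""")
    return conn

def ship_counts(conn, camera, counts, ts=None):
    """Record one snapshot of per-class object counts for a camera."""
    ts = time.time() if ts is None else ts
    conn.executemany(
        "INSERT INTO counts VALUES (?, ?, ?, ?)",
        [(ts, camera, cls, n) for cls, n in counts.items()])
    conn.commit()

# After-the-fact review: total cars seen by a camera.
conn = init_store()
ship_counts(conn, "cam0", {"car": 3, "bicycle": 1}, ts=0.0)
ship_counts(conn, "cam0", {"car": 2}, ts=1.0)
total_cars = conn.execute(
    "SELECT SUM(n) FROM counts WHERE object_class = 'car'").fetchone()[0]
```

Queries over such a table are what turn a live pipeline into aggregate insight, like the car and bicycle counts described in the next section.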

The Project Rocket video analytics platform (above) is self-contained and allows people to plug in TensorFlow and Darknet DNN models to create pipelines for object detection, object counting, and the like to drive higher-level applications such as traffic prediction analysis and smart homes.

Making streets safer

Project Rocket has focused on smart cities as its driving application. In partnership with the city of Bellevue, Washington, we used the framework to help make the city’s street system safer for drivers, riders, and pedestrians as part of its Vision Zero initiative to reduce traffic-related fatalities. With aggregate car and bicycle counts provided by a system built on the framework, for example, the city was able to assess the value of adding a bike lane to its downtown area.

One exciting traffic safety–related application we recently used the framework for, separate from our work with Bellevue, is a smart crosswalk. Using a live camera feed, the smart crosswalk, which is in the prototype stage, can detect when a person in a wheelchair is in the middle of the crosswalk and extend the signal timer so the person can safely finish crossing.

Throughout our research, and as we continue to develop Rocket, we’re devising privacy-protecting tools, including a “privacy protector” technique in which only the elements relevant to an application—for example, cars in a traffic-counting system—are made available; background elements and other details, such as people, homes, businesses, and license plate numbers in the traffic-counting example, are blacked out. Additionally, Rocket leverages edge computing, and beyond its efficiency benefits, we see the edge as a means of keeping data in a trusted space—that is, on users’ premises.

Get the code and get to work!

The Rocket platform is written in .NET Core, so it runs on Windows as well as Linux. The Rocket repository also includes simple instructions for creating Docker containers, allowing easy deployment with orchestration frameworks like Kubernetes. Docker containers are also readily compatible with appliances that bring computing to the edge, such as Microsoft Azure Stack Edge. Additionally, Rocket has easy-to-use code for optionally invoking customized Azure Machine Learning models in the cloud.

Check out the code and give it a spin!

For more information, including a tutorial on how to get started building your own video analytics applications atop the platform, check out our Project Rocket webinar, available on demand now.
