Breakthrough in Spatial Intelligence! Fei-Fei Li’s Team Uses Orbbec Cameras to Enable Robots for “Precision Tasks”

Recently, the world-leading AI research team led by Professor Fei-Fei Li at Stanford University unveiled a groundbreaking achievement in the field of “spatial intelligence”: ReKep (Relational Keypoint Constraints). The system demonstrated significant potential in robotic manipulation, automation of household tasks, and beyond. In this study, the team used an Orbbec RGB-D camera, the Femto Bolt, to capture both color images and depth data from the experimental settings. This enabled ReKep to recognize and locate objects and their keypoints within the scene, providing the 3D visual input needed for robot action optimization and complex interactions.

According to their research, ReKep is an optimization framework that transforms complex tasks into a series of relational keypoint constraints, represented as Python functions. These functions map 3D keypoints in the environment to numerical cost values, capturing spatially and semantically meaningful relationships that let robots decide autonomously how to act and address the core challenges in each task. The high-quality RGB and depth data from the Femto Bolt were essential for the ReKep system to accurately map 3D keypoints in the environment and define these relational constraints.
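To make the idea of a constraint "represented as a Python function" concrete, here is a minimal sketch. It is not taken from the ReKep code base: the function name, the keypoint ordering, the clearance and tolerance values, and the convention that a cost of zero or less means "satisfied" are all assumptions chosen for illustration.

```python
import numpy as np

def spout_above_cup(keypoints, clearance=0.10, tolerance=0.02):
    """Hypothetical relational constraint for a pouring task.

    keypoints: array of shape (K, 3) holding 3D keypoints in meters.
    Assumed ordering: index 0 is the teapot spout, index 1 is the cup opening.
    Returns a scalar cost that is <= 0 when the spout is within `tolerance`
    of the point `clearance` meters directly above the cup opening.
    """
    spout = keypoints[0]
    cup = keypoints[1]
    target = cup + np.array([0.0, 0.0, clearance])
    return float(np.linalg.norm(spout - target) - tolerance)

# Illustrative keypoints (made-up values, not measured data).
keypoints = np.array([
    [0.40, 0.10, 0.35],  # spout
    [0.40, 0.10, 0.25],  # cup opening
])

cost = spout_above_cup(keypoints)
satisfied = cost <= 0.0
print(f"cost={cost:.3f}, satisfied={satisfied}")
```

An optimizer can then search for robot motions that drive such costs to zero or below. Because the function only reads keypoint coordinates, the quality of the camera's 3D data directly bounds how meaningful the cost is.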

Depth Sensing for Precise Planning

The Femto Bolt is an RGB-D camera based on Microsoft’s advanced Time-of-Flight (ToF) sensor technology. The RGB images assist the ReKep system in target detection and recognition, while the depth images provide highly accurate 3D spatial information, allowing ReKep to determine where objects are and how far apart they are. Combining these inputs lets the ReKep system build a full picture of the operating environment and generate 3D coordinates for keypoints that guide robotic decision-making and execution.
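As a rough illustration of how a pixel detected in the color image plus its depth reading becomes a 3D keypoint, the sketch below applies standard pinhole back-projection. The intrinsic parameters are placeholder values, not Femto Bolt calibration data, and the sketch assumes the depth image is already aligned to the color image and ignores lens distortion. In practice these values would come from the camera's calibration.

```python
import numpy as np

def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in meters to a 3D point
    in the camera frame, using the pinhole camera model."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m])

# Placeholder intrinsics (assumed for illustration only).
fx = fy = 500.0        # focal lengths in pixels
cx, cy = 320.0, 288.0  # principal point in pixels

# A keypoint detected 100 pixels right of the principal point, 1 m away.
point = deproject(420, 288, 1.0, fx, fy, cx, cy)
print(point)
```

With these numbers the keypoint lies 0.2 m to the right of the optical axis at 1 m depth. An error in the depth reading scales all three coordinates, which is why depth accuracy matters for keypoint extraction.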

The extraction of 3D keypoints demands exceptional data quality, especially in depth precision and point cloud accuracy. The Femto Bolt delivers 4K high-resolution images, significantly reducing false identifications and enhancing interaction reliability, which is crucial for fine, complex robotic operations.

Low Latency for Smooth Interactions

For smooth human-machine interaction, the ReKep system requires fast capture and transmission of 3D keypoints. The Femto Bolt captures depth data at 30 frames per second (fps) and integrates depth and color imaging modules with an inertial sensor. This combination enables low-latency, real-time feedback, ensuring that robots can respond rapidly to the ReKep system’s commands.

Multi-Device Synchronization for Efficient Collaboration

The research team conducted experiments on various robotic platforms, including single- and dual-arm robots executing multi-stage, dual-hand cooperative tasks. The Femto Bolt supports multi-device synchronization, enabling a larger spatial capture range and facilitating efficient collaboration among multiple robots to complete shared tasks.

For synchronization, the Femto Bolt uses a versatile 8-pin GPIO interface with extensive expansion capabilities. It is also equipped with a lockable USB-C interface, ensuring reliable power supply and data transmission, which enhances the system’s stability and security.

Deep Integration with Multimodal Large Models

Notably, Fei-Fei Li’s team also combined vision models with vision-language models such as GPT-4o (OpenAI’s multimodal model, used in ChatGPT), tightly coupling visual perception with robotic learning. Even without task-specific data or detailed environment models, ReKep generalized well in unstructured environments and adapted across a variety of task strategies.

With over 8 years of experience in commercial robotic vision deployment, Orbbec is committed to developing the “eyes” for robots. Earlier this year, Orbbec partnered with NVIDIA to integrate the advanced iToF technology of the Femto Bolt camera with the AI computing power of NVIDIA Orin AGX, coupled with a Universal Robots UR5 robotic arm. This collaboration set a new industry benchmark for high-precision object detection, picking, and placement in warehouse automation. In its pursuit of multimodal large model technology (spanning voice, text, and vision) and robotic arm control, Orbbec’s R&D team has launched the 2.0 version of its large-model robotic arm, capable of accurately identifying everyday objects and interpreting and executing commands.

Looking ahead, Orbbec remains committed to advancing the application of robotic vision sensor technologies, supporting the development of spatial intelligence systems, and driving further breakthroughs in robotic vision and artificial intelligence. This will pave the way for a wider range of intelligent applications.

Orbbec Camera Femto Bolt and its Large-Model Robotic Arm Point Cloud Effect Diagram

 

Part of the content is sourced from:

ReKep | Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation (rekep-robot.github.io)
