Recently, the world-leading AI research team led by Professor Fei-Fei Li at Stanford University unveiled a groundbreaking achievement in the field of “spatial intelligence”: ReKep (Relational Keypoint Constraints). The system demonstrated significant potential in robotic manipulation, automation of household tasks, and beyond. In this study, the team used Orbbec’s RGB-D camera, the Femto Bolt, to capture both color images and depth data from the experimental settings. This enabled ReKep to recognize and locate objects and their keypoints within the scene, providing essential 3D visual input for robot action optimization and complex interactions.
According to their research, ReKep is an optimization framework that transforms complex tasks into a series of relational keypoint constraints, represented as Python functions. These functions map 3D keypoints in the environment to numerical cost values, capturing spatially and semantically meaningful relationships that give robots autonomous decision-making capabilities and let them address the core challenges of each task. The high-quality RGB and depth data from the Femto Bolt were essential for the ReKep system to accurately map environmental 3D keypoints and define these relational constraints.
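As an illustration of what such a constraint might look like, here is a minimal sketch that is not taken from the ReKep codebase. It assumes a hypothetical pouring task in which keypoint 0 is a teapot spout and keypoint 1 is a cup opening, and it assumes the convention that a cost of zero or below means the constraint is satisfied. The function name, the keypoint indices, and the `height` and `tol` parameters are all illustrative.

```python
import math
from typing import Sequence

Point = Sequence[float]  # (x, y, z) in metres; an assumed convention


def spout_above_cup_cost(
    keypoints: Sequence[Point],
    height: float = 0.10,  # hypothetical desired clearance above the cup
    tol: float = 0.01,     # hypothetical tolerance, in metres
) -> float:
    """Map 3D keypoints to a scalar cost (<= 0 means satisfied)."""
    spout, cup = keypoints[0], keypoints[1]
    # Target position: directly above the cup keypoint by `height`.
    target = (cup[0], cup[1], cup[2] + height)
    # Distance from the spout to the target, minus the allowed tolerance.
    return math.dist(spout, target) - tol


# Two made-up keypoint sets: one aligned, one shifted 20 cm sideways.
aligned = [(0.30, 0.10, 0.15), (0.30, 0.10, 0.05)]
offset = [(0.50, 0.10, 0.15), (0.30, 0.10, 0.05)]

print(spout_above_cup_cost(aligned) <= 0)  # constraint satisfied
print(spout_above_cup_cost(offset) > 0)    # constraint violated
```

Because the constraint is an ordinary function that returns a number, an optimizer can evaluate it for candidate robot poses and prefer those that drive the cost down.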
–
Depth Sensing for Precise Planning
The Femto Bolt is an RGB-D camera based on Microsoft’s advanced Time-of-Flight (ToF) sensor technology. The RGB images assist the ReKep system in target detection and recognition, while the depth images provide highly accurate 3D spatial information, allowing ReKep to determine objects’ spatial positions and the distances between them. Together, these inputs enable the ReKep system to build a full picture of the operational environment and generate 3D coordinates for keypoints that guide robotic decision-making and execution.
The extraction of 3D keypoints demands exceptional data quality, especially regarding depth precision and point cloud accuracy. The Femto Bolt delivers 4K-resolution color images, significantly reducing false identifications and enhancing interaction reliability, which is crucial for fine, complex robotic operations.
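The step from a depth pixel to a 3D coordinate can be sketched with the standard pinhole camera model. This is a minimal sketch, not the Orbbec SDK or the ReKep implementation: it assumes the depth image is already aligned to the color image, ignores lens distortion, and uses made-up intrinsics (`FX`, `FY`, `CX`, `CY`) where a real setup would read calibrated values from the camera.

```python
from typing import Tuple


def deproject_pixel(
    u: float, v: float, depth_mm: float,
    fx: float, fy: float, cx: float, cy: float,
) -> Tuple[float, float, float]:
    """Convert pixel (u, v) with a depth in millimetres to (x, y, z) in metres."""
    z = depth_mm / 1000.0   # millimetres to metres
    x = (u - cx) * z / fx   # horizontal offset from the optical axis
    y = (v - cy) * z / fy   # vertical offset from the optical axis
    return (x, y, z)


# Hypothetical intrinsics, for illustration only (not Femto Bolt calibration).
FX, FY, CX, CY = 500.0, 500.0, 320.0, 288.0

# A keypoint detected 100 pixels right of the image centre, 0.5 m away.
point = deproject_pixel(420, 288, 500, FX, FY, CX, CY)
print(point)
```

A keypoint detected in the color image gets its 3D position by looking up the depth value at the same pixel, which is why the two images must be aligned.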
–
Low Latency for Smooth Interactions
For smooth human-machine interaction, the ReKep system requires fast capture and transmission of 3D keypoints. The Femto Bolt captures depth data at 30 frames per second (fps), so a new depth frame arrives roughly every 33 ms, and it integrates depth and color imaging modules with an inertial sensor. This combination provides low-latency, real-time feedback, ensuring that robots can respond rapidly to the ReKep system’s commands.
–
Multi-Device Synchronization for Efficient Collaboration
The research team conducted experiments on various robotic platforms, including single- and dual-arm robots executing multi-stage, dual-hand cooperative tasks. The Femto Bolt supports multi-device synchronization, enabling a larger spatial capture range and facilitating efficient collaboration among multiple robots to complete shared tasks.
For synchronization, the Femto Bolt uses a versatile 8-pin GPIO interface with extensive expansion capabilities. It is also equipped with a lockable USB-C interface, ensuring reliable power supply and data transmission, which enhances the system’s stability and security.
–
Deep Integration with Multimodal Large Models
Notably, Fei-Fei Li’s team also combined visual models with vision-language models such as GPT-4o (OpenAI’s model behind ChatGPT), demonstrating deep integration between visual perception and robotic learning. Even without task-specific data or detailed environmental models, ReKep generalized well in unstructured environments and adapted across a variety of task strategies.
With over 8 years of experience in commercial robotic vision deployment, Orbbec is committed to developing the “eyes” for robots. Earlier this year, Orbbec partnered with NVIDIA to integrate the advanced iToF technology of the Femto Bolt camera with the AI computing power of NVIDIA Orin AGX, coupled with a Universal Robots UR5 robotic arm. This collaboration set a new industry benchmark for high-precision object detection, picking, and placement in warehouse automation. In its pursuit of multimodal large model technology (spanning voice, text, and vision) and robotic arm control, Orbbec’s R&D team has launched version 2.0 of its large-model robotic arm, capable of accurately identifying everyday objects and interpreting and executing commands.
Looking ahead, Orbbec remains committed to advancing the application of robotic vision sensor technologies, supporting the development of spatial intelligence systems, and driving further breakthroughs in robotic vision and artificial intelligence. This will pave the way for a wider range of intelligent applications.
Orbbec Camera Femto Bolt and its Large-Model Robotic Arm Point Cloud Effect Diagram
Part of the content is sourced from: