The image transformation capabilities described here are reused in full from Azure Kinect DK. They mainly cover depth-to-color alignment (D2C), color-to-depth alignment (C2D), and depth-to-point-cloud conversion.
k4a_transformation functions
The k4a_transformation functions in Azure Kinect DK convert images between the coordinate systems of the depth camera and the color camera. All functions prefixed with k4a_transformation operate on entire images. They require a transformation handle of type k4a_transformation_t, obtained through k4a_transformation_create() and released via k4a_transformation_destroy(). The SDK transformation example demonstrates how to use these functions.
This document will introduce the following functions:
- k4a_transformation_depth_image_to_color_camera()
- k4a_transformation_depth_image_to_color_camera_custom()
- k4a_transformation_color_image_to_depth_camera()
- k4a_transformation_depth_image_to_point_cloud()
k4a_transformation_depth_image_to_color_camera
Overview
The function k4a_transformation_depth_image_to_color_camera() transforms the depth map from the viewpoint of the depth camera to the viewpoint of the color camera. This function is intended to produce so-called RGB-D images, where D is an additional image channel recording depth values. As seen in the image below, the output of k4a_transformation_depth_image_to_color_camera() is a depth image that looks as if it was captured from the same viewpoint as the color image (i.e. the viewpoint of the color camera).
Implementation
This transformation function is more complex than simply calling k4a_calibration_2d_to_2d() individually for each pixel. It warps the triangular mesh geometry of the depth camera into the geometry of the color camera. Using the triangular mesh avoids holes in the transformed depth image. A z-buffer ensures correct handling of occlusions. GPU acceleration is enabled for this function by default.
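The occlusion rule mentioned above can be illustrated with a minimal z-buffer sketch. This is not the SDK's implementation, which rasterizes a triangular mesh. It only shows the rule that when several depth samples land on the same output pixel, the one nearest to the camera is kept. The function name zbuffer_write and the convention that 0 marks an empty pixel are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write one depth sample (in millimeters) into an
 * output depth buffer, keeping the nearest surface per pixel.
 * Assumption: a stored value of 0 means "no depth written yet". */
void zbuffer_write(uint16_t *zbuf, size_t index, uint16_t depth_mm)
{
    assert(zbuf != NULL);
    if (depth_mm == 0) {
        return; /* invalid sample: nothing to write */
    }
    if (zbuf[index] == 0 || depth_mm < zbuf[index]) {
        zbuf[index] = depth_mm; /* nearer surface occludes the farther one */
    }
}
```

Writing 1500, 900 and 1200 to the same pixel in this order leaves 900 in the buffer, since that sample is closest to the camera.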
Parameters
The input parameters are the transformation handle and the depth image. The depth image resolution must match the depth_mode specified when the transformation handle was created. For example, if the transformation handle was created with K4A_DEPTH_MODE_WFOV_UNBINNED (1024×1024), the depth image must be 1024×1024 pixels. The output is the transformed depth image, which the user must allocate through a call to k4a_image_create(). Its resolution must match the color_resolution specified when the transformation handle was created. For example, if the color resolution was set to K4A_COLOR_RESOLUTION_1080P, the output image must be 1920×1080 pixels. The output image stride is set to width * sizeof(uint16_t), since the image stores 16-bit depth values.
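The sizing rule above comes down to simple arithmetic. The sketch below computes the stride and the total buffer size for a transformed depth image. The helper names are hypothetical and no SDK calls are made; the resulting width, height and stride are the kind of values one would pass to k4a_image_create().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes per row of a 16-bit depth image. */
size_t depth16_stride_bytes(int width_pixels)
{
    assert(width_pixels > 0);
    return (size_t)width_pixels * sizeof(uint16_t);
}

/* Hypothetical helper: total bytes, assuming rows are tightly packed
 * (no padding beyond width * sizeof(uint16_t)). */
size_t depth16_size_bytes(int width_pixels, int height_pixels)
{
    assert(height_pixels > 0);
    return depth16_stride_bytes(width_pixels) * (size_t)height_pixels;
}
```

For K4A_COLOR_RESOLUTION_1080P (1920×1080) this gives a stride of 3840 bytes and a buffer of 4147200 bytes.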
k4a_transformation_depth_image_to_color_camera_custom
Overview
The function k4a_transformation_depth_image_to_color_camera_custom() transforms the depth map and a custom image from the viewpoint of the depth camera to the viewpoint of the color camera. As an extension of k4a_transformation_depth_image_to_color_camera(), this function is intended to produce a corresponding custom image with each pixel matched to the corresponding pixel coordinate in the color camera.
Implementation
This transformation function produces the transformed depth image in the same way as k4a_transformation_depth_image_to_color_camera(). To transform the custom image, this function provides the option of linear interpolation or nearest neighbor interpolation. Linear interpolation can create new values in the transformed custom image. Nearest neighbor interpolation prevents values not present in the original image from appearing in the output, but results in a less smooth image. The custom image should be single channel 8 or 16 bit. GPU acceleration is enabled for this function by default.
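The difference between the two interpolation types can be seen on a single pair of neighboring samples. The sketch below is a one-dimensional illustration, not the SDK's code; the function names are hypothetical. It shows why linear interpolation may be unsuitable for custom images that hold discrete labels.

```c
#include <assert.h>
#include <stdint.h>

/* Linear interpolation between two neighboring samples a and b.
 * t in [0, 1] is the fractional position between them. */
uint16_t sample_linear(uint16_t a, uint16_t b, float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    float v = (1.0f - t) * (float)a + t * (float)b;
    return (uint16_t)(v + 0.5f); /* round to the nearest integer */
}

/* Nearest neighbor: return whichever input sample is closer,
 * so the result is always one of the original values. */
uint16_t sample_nearest(uint16_t a, uint16_t b, float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    return (t < 0.5f) ? a : b;
}
```

Halfway between the values 1 and 3, sample_linear returns 2, a value that does not occur in the input, while sample_nearest returns 3.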
Parameters
The input parameters are the transformation handle, the depth image, the custom image, and the interpolation type. The depth image and custom image resolutions must match the depth_mode specified when the transformation handle was created. For example, if the transformation handle was created with K4A_DEPTH_MODE_WFOV_UNBINNED (1024×1024), the depth image and the custom image must both be 1024×1024 pixels. The interpolation_type should be K4A_TRANSFORMATION_INTERPOLATION_TYPE_LINEAR or K4A_TRANSFORMATION_INTERPOLATION_TYPE_NEAREST. The outputs are the transformed depth image and the transformed custom image, which the user must allocate through calls to k4a_image_create(). Their resolutions must match the color_resolution specified when the transformation handle was created. For example, if the color resolution was set to K4A_COLOR_RESOLUTION_1080P, both output images must be 1920×1080 pixels. The output depth image stride is set to width * sizeof(uint16_t), since the image stores 16-bit depth values. The input custom image and the transformed custom image should have format K4A_IMAGE_FORMAT_CUSTOM8 or K4A_IMAGE_FORMAT_CUSTOM16, with the stride set accordingly.
k4a_transformation_color_image_to_depth_camera
Overview
The function k4a_transformation_color_image_to_depth_camera() transforms the color image from the viewpoint of the color camera to the viewpoint of the depth camera (see image above). This function can be used to produce RGB-D images.
Implementation
For each pixel in the depth map, this function uses the pixel's depth value to calculate the corresponding subpixel coordinate in the color image. It then looks up the color value at that coordinate in the color image. Bilinear interpolation is performed in the color image to obtain color values with subpixel precision. Pixels without an associated depth reading are assigned the BGRA value [0,0,0,0] in the output image. GPU acceleration is enabled for this function by default. Since this method can leave holes in the transformed color image and does not handle occlusions, we recommend using the function k4a_transformation_depth_image_to_color_camera() instead.
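The lookup step described above can be sketched as follows. This is an illustration only, not the SDK's implementation: the function name sample_bgra is hypothetical, and the subpixel coordinate (x, y) is taken as an input because computing it requires the device calibration. The sketch shows bilinear interpolation in a pixel-interleaved BGRA image and the [0,0,0,0] result for pixels without depth.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bilinearly sample a pixel-interleaved BGRA image at
 * the subpixel coordinate (x, y). If depth_mm is 0 (no depth reading), the
 * output is [0,0,0,0]. Assumes 0 <= x <= width-1 and 0 <= y <= height-1. */
void sample_bgra(const uint8_t *img, int width, int height, int stride_bytes,
                 float x, float y, uint16_t depth_mm, uint8_t out[4])
{
    assert(img && out && width > 0 && height > 0);
    if (depth_mm == 0) {
        out[0] = out[1] = out[2] = out[3] = 0;
        return;
    }
    assert(x >= 0.0f && x <= (float)(width - 1));
    assert(y >= 0.0f && y <= (float)(height - 1));

    int x0 = (int)x, y0 = (int)y;
    int x1 = (x0 + 1 < width) ? x0 + 1 : x0;   /* clamp at the right edge */
    int y1 = (y0 + 1 < height) ? y0 + 1 : y0;  /* clamp at the bottom edge */
    float fx = x - (float)x0, fy = y - (float)y0;

    for (int c = 0; c < 4; ++c) { /* c = 0..3 -> B, G, R, A */
        float p00 = img[y0 * stride_bytes + x0 * 4 + c];
        float p10 = img[y0 * stride_bytes + x1 * 4 + c];
        float p01 = img[y1 * stride_bytes + x0 * 4 + c];
        float p11 = img[y1 * stride_bytes + x1 * 4 + c];
        float top = p00 + fx * (p10 - p00);    /* blend along x, upper row */
        float bot = p01 + fx * (p11 - p01);    /* blend along x, lower row */
        out[c] = (uint8_t)(top + fy * (bot - top) + 0.5f); /* blend along y */
    }
}
```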
Parameters
The input parameters are the transformation handle, the depth image, and the color image. The depth image and color image resolutions must match the depth_mode and color_resolution specified when the transformation handle was created. The output is the transformed color image, which the user must allocate through a call to k4a_image_create(). Its resolution must match the depth resolution, that is, the resolution of the depth_mode specified when the transformation handle was created. The output image stores four 8-bit values (BGRA) for each pixel, so the image stride is width * 4 * sizeof(uint8_t). The data order is pixel interleaved: blue value of pixel 0, green value of pixel 0, red value of pixel 0, alpha value of pixel 0, blue value of pixel 1, and so on.
k4a_transformation_depth_image_to_point_cloud
Overview
The function k4a_transformation_depth_image_to_point_cloud() transforms a 2D depth map captured by a camera into a 3D point cloud in the coordinate system of the same camera. The camera can therefore be either the depth camera or the color camera.
Implementation
The result of this function is the same as running k4a_calibration_2d_to_3d() individually for each pixel, but it is more efficient to compute. When k4a_transformation_create() is called, a so-called xy lookup table is precomputed, which stores the x and y scale factors for each image pixel. When k4a_transformation_depth_image_to_point_cloud() is called, the 3D X coordinate of a pixel is obtained by multiplying the pixel's x scale factor by the pixel's Z coordinate. Similarly, multiplying by the y scale factor gives the 3D Y coordinate. The SDK fast point cloud sample demonstrates how to compute the xy table. Users can follow the sample code to implement their own version of this function, for example to accelerate their GPU pipeline.
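The xy table idea can be sketched in a few lines. The code below is an illustration under a simplifying assumption: an ideal pinhole camera without lens distortion, described by the hypothetical parameters fx, fy, cx and cy. The SDK derives its table from the full device calibration, so this is not its implementation. The type and function names are also hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { float x; float y; } xy_entry_t; /* hypothetical table entry */

/* Precompute the per-pixel scale factors once.
 * Assumed pinhole model: x_scale = (u - cx) / fx, y_scale = (v - cy) / fy. */
void build_xy_table(xy_entry_t *table, int width, int height,
                    float fx, float fy, float cx, float cy)
{
    assert(table && width > 0 && height > 0 && fx != 0.0f && fy != 0.0f);
    for (int v = 0; v < height; ++v) {
        for (int u = 0; u < width; ++u) {
            size_t i = (size_t)v * (size_t)width + (size_t)u;
            table[i].x = ((float)u - cx) / fx;
            table[i].y = ((float)v - cy) / fy;
        }
    }
}

/* Per frame: X = x_scale * Z and Y = y_scale * Z, where Z is the depth in
 * millimeters. A depth of 0 yields [0,0,0]. Output is interleaved X, Y, Z.
 * Assumes depth values fit in int16_t. */
void depth_to_xyz(const xy_entry_t *table, const uint16_t *depth_mm,
                  int16_t *xyz, size_t pixel_count)
{
    assert(table && depth_mm && xyz);
    for (size_t i = 0; i < pixel_count; ++i) {
        float z = (float)depth_mm[i];
        xyz[3 * i + 0] = (int16_t)(table[i].x * z);
        xyz[3 * i + 1] = (int16_t)(table[i].y * z);
        xyz[3 * i + 2] = (int16_t)depth_mm[i];
    }
}
```

The table depends only on the camera parameters, so it is built once, and each frame then costs two multiplications per pixel.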
Parameters
The input parameters are the transformation handle, the camera specifier, and the depth image. If the camera specifier is set to the depth camera, the depth image resolution must match the depth_mode specified when the transformation handle was created. Otherwise, if it is set to the color camera, the resolution must match the chosen color_resolution. The output parameter is the XYZ image, which the user must allocate through a call to k4a_image_create(). The XYZ image resolution must match that of the input depth map. Three signed 16-bit coordinates (in millimeters) are stored for each pixel, so the XYZ image stride is set to width * 3 * sizeof(int16_t). The data order is pixel interleaved: X coordinate of pixel 0, Y coordinate of pixel 0, Z coordinate of pixel 0, X coordinate of pixel 1, and so on. If a pixel cannot be transformed to 3D, the function assigns it the value [0,0,0].
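Reading points back from the XYZ image follows directly from the layout described above. The sketch below assumes tightly packed rows, that is, a stride of exactly width * 3 * sizeof(int16_t), and takes the buffer as a plain int16_t array. The type and function names are hypothetical and no SDK calls are made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { int16_t x, y, z; } point_mm_t; /* hypothetical point type */

/* Stride of an XYZ image row in bytes: three int16_t values per pixel. */
size_t xyz_stride_bytes(int width)
{
    assert(width > 0);
    return (size_t)width * 3 * sizeof(int16_t);
}

/* Read the point stored for pixel (col, row). Returns false when the pixel
 * holds [0,0,0], i.e. it could not be transformed to 3D. */
bool read_xyz(const int16_t *xyz, int width, int col, int row, point_mm_t *out)
{
    assert(xyz && out && col >= 0 && col < width && row >= 0);
    const int16_t *p = xyz + ((size_t)row * (size_t)width + (size_t)col) * 3;
    out->x = p[0];
    out->y = p[1];
    out->z = p[2];
    return !(p[0] == 0 && p[1] == 0 && p[2] == 0);
}
```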
Example
Next Steps
Now that you know how to use the Femto Bolt 3D camera SDK image transformation functions, you can also learn about: Calibration Functions