OpenPCDet: The Open-MMLab 3D Object Detection Codebase for LiDAR Point Clouds
With the continued progress of autonomous driving and robotics, 3D object detection on point clouds has advanced steadily in recent years. However, the growing number of point cloud datasets (KITTI, nuScenes, Lyft, Waymo, PandaSet, etc.) often differ in data format and 3D coordinate system, and point cloud perception algorithms (point-based, voxel-based, one-stage/two-stage, etc.) also vary widely in form. This makes it hard for researchers to run experiments on different combinations within a single unified framework.
To address this, we have open-sourced OpenPCDet, a PyTorch-based codebase for 3D object detection from point clouds:
https://github.com/open-mmlab/OpenPCDet
It mainly consists of the newly revised PCDet (v0.2) framework for 3D object detection from point clouds, including the PV-RCNN 3D detection algorithm, which we are open-sourcing for the first time.
Below, we introduce the overall design and advantages of the PCDet 3D object detection framework, followed by brief usage notes on how to add a new dataset and how to combine or develop new models.
Introduction to the PCDet 3D Object Detection Framework
Top-level design: separating data from models
Unlike image processing, 3D object detection on point clouds involves many dataset-specific 3D coordinate definitions and conversions, and researchers easily get lost in them. PCDet therefore defines one unified, normalized 3D coordinate representation that is used throughout data processing and model computation, which fully decouples the data module from the model module. This has two advantages: (1) when developing models with different structures, researchers do all related processing (such as loss computation, RoI pooling, and model post-processing) in the same standardized 3D coordinate system, without having to care how each dataset represents coordinates; (2) when adding a new dataset, researchers only need to write a small amount of code that converts the raw data into the standardized coordinate definition, and PCDet then applies data augmentation automatically and adapts the data to the various models.
This top-level separation of data and models lets researchers easily adapt any model to different point cloud 3D object detection datasets, without worrying about getting lost in 3D coordinate conversions during model development.
Unified coordinate definition for 3D object detection
Point cloud datasets often differ in how they define the coordinate system and the 3D box (the mixed use of the camera and LiDAR coordinate systems in KITTI also tends to confuse newcomers). PCDet therefore uses one fixed, unified point cloud coordinate system (shown in the lower right corner of Figure 1) and a more standardized 3D bounding box definition throughout data augmentation, data processing, model computation, and detection post-processing. The 7 values of a 3D bounding box are defined as follows (see Figure 2):
(cx, cy, cz, dx, dy, dz, heading)
Here (cx, cy, cz) is the geometric center of the object's 3D box, (dx, dy, dz) are the lengths of the box along the x, y, and z axes when the heading angle is 0, and heading is the object's orientation in the top-down view (0 along the x axis, increasing counterclockwise from x toward y).
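To make the box definition concrete, here is a minimal sketch in plain Python that turns a 7-value box into its top-view corners and its height range. The function names bev_corners and z_range are illustrative and are not part of PCDet; only the meaning of the seven values comes from the definition above.

```python
import math

def bev_corners(box):
    """Top-view corners of a box given as (cx, cy, cz, dx, dy, dz, heading)."""
    cx, cy, _cz, dx, dy, _dz, heading = box
    cos_h, sin_h = math.cos(heading), math.sin(heading)
    corners = []
    # Offsets in the box's own frame: dx lies along the heading, dy across it.
    for sign_x, sign_y in ((1, 1), (1, -1), (-1, -1), (-1, 1)):
        local_x, local_y = sign_x * dx / 2, sign_y * dy / 2
        # Rotate counterclockwise by heading (from x toward y), then shift to the center.
        corners.append((cx + local_x * cos_h - local_y * sin_h,
                        cy + local_x * sin_h + local_y * cos_h))
    return corners

def z_range(box):
    """Bottom and top height of the box; cz is the geometric center, not the floor."""
    _cx, _cy, cz, _dx, _dy, dz, _heading = box
    return cz - dz / 2, cz + dz / 2

# A 4 m x 2 m x 1.5 m box with heading 0: its 4 m side runs along the x axis.
car = (10.0, 2.0, -0.5, 4.0, 2.0, 1.5, 0.0)
print(bev_corners(car))
print(z_range(car))
```

With heading set to pi/2 the same box has its 4 m side along the y axis, which is what "increasing counterclockwise from x toward y" implies.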
With the standardized 3D box definition used by PCDet, we no longer have to wonder whether a box is anchored at the object's 3D center or at its bottom center, whether the three dimensions are ordered l-w-h or w-l-h, or where heading 0 points and whether the angle increases clockwise or counterclockwise.
Flexible and comprehensive modular model topology
With the flexible and comprehensive modular design shown in Figure 3, building a 3D object detector in PCDet only requires writing a config file that specifies the required modules. PCDet then automatically assembles them into a detector according to the topological order of the modules, ready for training and testing.
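The idea of assembling a detector from a config by topological order can be sketched as follows. This is a toy illustration in plain Python, not PCDet's builder: the slot names, the registry, the "NAME" key, and the Toy* modules are all assumptions made for the example.

```python
# Slot names and their order are illustrative, not PCDet's actual list.
MODULE_TOPOLOGY = ["vfe", "backbone", "head"]
REGISTRY = {}

def register(slot, name):
    """Decorator that records a module factory under (slot, name)."""
    def wrap(factory):
        REGISTRY[(slot, name)] = factory
        return factory
    return wrap

@register("vfe", "ToyVFE")
def make_vfe(cfg):
    return lambda batch: batch + ["vfe"]

@register("backbone", "ToyBackbone")
def make_backbone(cfg):
    return lambda batch: batch + ["backbone x%d" % cfg.get("LAYERS", 1)]

@register("head", "ToyHead")
def make_head(cfg):
    return lambda batch: batch + ["head"]

def build_detector(model_cfg):
    """Instantiate the configured modules in topological order."""
    modules = []
    for slot in MODULE_TOPOLOGY:
        if slot not in model_cfg:
            continue  # a slot missing from the config is simply skipped
        cfg = model_cfg[slot]
        modules.append(REGISTRY[(slot, cfg["NAME"])](cfg))

    def detector(batch):
        for module in modules:
            batch = module(batch)
        return batch
    return detector

# The config lists modules in arbitrary order; the topology decides execution order.
model_cfg = {
    "head": {"NAME": "ToyHead"},
    "vfe": {"NAME": "ToyVFE"},
    "backbone": {"NAME": "ToyBackbone", "LAYERS": 4},
}
print(build_detector(model_cfg)([]))
```

Each toy module only appends its name to a list, so the printed trace shows the order in which the modules ran.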
Based on the framework in Figure 3, PCDet can support the vast majority of existing 3D object detection algorithms for LiDAR point clouds, including voxel-based, point-based, point-voxel hybrid, and one-stage/two-stage detectors (see the examples in Figure 4).
Clear and concise code structure
PCDet has fully refactored its data augmentation and data preprocessing modules on top of numpy and PyTorch. Built around two base classes, data_augmentor and data_processor, they make it easy to add or remove individual augmentation and preprocessing operations.
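As a rough illustration of how augmentation steps can be added or removed once points and boxes share one coordinate definition, here is a small sketch in plain Python. AugmentorQueue, flip_along_x, and rotate_about_z are hypothetical names, not PCDet's data_augmentor API. Real augmentations draw their parameters at random and would be vectorized with numpy; this sketch is deterministic so it is easy to check by hand.

```python
import math

def flip_along_x(points, boxes):
    """Mirror the scene across the x axis: y and heading change sign."""
    points = [(x, -y, z) for x, y, z in points]
    boxes = [(cx, -cy, cz, dx, dy, dz, -heading)
             for cx, cy, cz, dx, dy, dz, heading in boxes]
    return points, boxes

def rotate_about_z(points, boxes, angle):
    """Rotate the scene counterclockwise about the z axis by `angle` radians."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)

    def rot(x, y):
        return x * cos_a - y * sin_a, x * sin_a + y * cos_a

    points = [(*rot(x, y), z) for x, y, z in points]
    boxes = [(*rot(cx, cy), cz, dx, dy, dz, heading + angle)
             for cx, cy, cz, dx, dy, dz, heading in boxes]
    return points, boxes

class AugmentorQueue:
    """Hypothetical stand-in for a configurable augmentation queue."""

    def __init__(self, steps):
        self.steps = list(steps)  # (function, keyword arguments) pairs

    def __call__(self, points, boxes):
        for step, kwargs in self.steps:
            points, boxes = step(points, boxes, **kwargs)
        return points, boxes

# Adding or removing an operation only changes this list.
augment = AugmentorQueue([(flip_along_x, {}),
                          (rotate_about_z, {"angle": math.pi / 2})])
points, boxes = augment([(1.0, 2.0, 0.0)],
                        [(1.0, 2.0, 0.0, 4.0, 2.0, 1.5, 0.3)])
print(points, boxes)
```

Because every step transforms points and boxes together, the box sizes stay untouched and only the center and heading change.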
When refactoring PCDet, we tried to keep the code structure as clear and concise as possible, implementing the whole framework in the simplest possible Python and PyTorch (the CUDA code involved also comes with clearly defined interfaces), so that researchers can easily understand the code logic, modify it, and use it.
Stronger 3D object detection performance
As one of the earliest teams to open-source two-stage 3D object detection code for point clouds, we have proposed a series of high-performance 3D detectors, including PointRCNN, PartA2-Net, and PV-RCNN. In this PCDet update we open-source PV-RCNN for the first time; at the time of writing it is still the best-performing LiDAR-only 3D object detection algorithm on the KITTI and Waymo leaderboards.
We hope the high-performance 3D object detection algorithms open-sourced in PCDet give researchers stronger baselines and serve as a handy tool for climbing competition leaderboards.
How to support a new dataset?
As mentioned above, PCDet's data-model separation and normalized coordinate representation make it easy to extend to new datasets. Specifically, researchers only need to do two things in their own dataloader:
(1) In self.__getitem__(), load your own data, convert both the point cloud and the 3D annotation boxes to the unified coordinate definition described above, and pass them to self.prepare_data(), which is provided by the dataset base class;
(2) In self.generate_prediction_dicts(), take the 3D boxes predicted by the model, which are expressed in the unified coordinate system, and convert them back to whatever format you need.
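The two steps above can be sketched as follows. This is a self-contained illustration, not PCDet code: DatasetBase is a hypothetical stand-in for the real base class (only the hook names prepare_data and generate_prediction_dicts come from the text), the dictionary keys are assumptions, and the raw box convention (bottom-center position, sizes ordered h-w-l, clockwise yaw) is invented to show the kind of conversion involved.

```python
class DatasetBase:
    """Hypothetical stand-in for the dataset base class."""

    def prepare_data(self, data_dict):
        # The real base class would apply augmentation and preprocessing here.
        return data_dict

def raw_to_unified(raw_box):
    """(x, y, z_bottom, h, w, l, yaw_cw) -> (cx, cy, cz, dx, dy, dz, heading)."""
    x, y, z_bottom, h, w, l, yaw_cw = raw_box
    # Bottom center -> geometric center; h-w-l -> dx-dy-dz; clockwise -> counterclockwise.
    return (x, y, z_bottom + h / 2, l, w, h, -yaw_cw)

def unified_to_raw(box):
    """Inverse of raw_to_unified."""
    cx, cy, cz, dx, dy, dz, heading = box
    return (cx, cy, cz - dz / 2, dz, dy, dx, -heading)

class MyDataset(DatasetBase):
    def __init__(self, frames):
        self.frames = frames  # raw frames in the dataset's own (assumed) format

    def __len__(self):
        return len(self.frames)

    def __getitem__(self, index):
        raw = self.frames[index]
        # Step (1): convert annotations to the unified definition, then hand over.
        gt_boxes = [raw_to_unified(box) for box in raw["boxes"]]
        return self.prepare_data({"points": raw["points"], "gt_boxes": gt_boxes})

    def generate_prediction_dicts(self, pred_boxes):
        # Step (2): predictions arrive in unified coordinates; convert them back.
        return [{"box": unified_to_raw(box)} for box in pred_boxes]

frames = [{"points": [(0.0, 0.0, 0.0)],
           "boxes": [(5.0, 1.0, -1.0, 1.5, 2.0, 4.0, 0.5)]}]
dataset = MyDataset(frames)
sample = dataset[0]
print(sample["gt_boxes"])
print(dataset.generate_prediction_dicts(sample["gt_boxes"]))
```

Feeding the converted ground-truth boxes back through generate_prediction_dicts recovers the raw annotations, which is a quick sanity check that the two conversions are inverses of each other.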
How to combine or improve existing models and support new ones?
As Figure 3 shows, PCDet already supports the vast majority of modules. For a new (or newly combined) 3D detection model, you only need to implement the modules specific to it (for example a new backbone or a new head) within the PCDet framework to replace the existing ones, and update the corresponding model config file. All other modules and the data processing pipeline can be reused directly from PCDet.
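A toy sketch of this workflow: a new backbone is registered next to an existing one, and switching between them only changes a name in the config while the head is reused as is. The registry, the class names, and the "BACKBONE"/"NAME" keys are assumptions made for illustration and do not reflect PCDet's actual interfaces.

```python
BACKBONES = {}

def register_backbone(cls):
    """Make a backbone class selectable by its class name in the config."""
    BACKBONES[cls.__name__] = cls
    return cls

@register_backbone
class SumBackbone:
    """Stands in for an existing backbone."""

    def __call__(self, batch):
        batch["features"] = sum(batch["points"])
        return batch

@register_backbone
class MaxBackbone:
    """Stands in for the newly implemented backbone."""

    def __call__(self, batch):
        batch["features"] = max(batch["points"])
        return batch

def toy_head(batch):
    """Existing head, reused unchanged by both models."""
    batch["score"] = batch["features"] * 0.5
    return batch

def build_model(cfg):
    backbone = BACKBONES[cfg["BACKBONE"]["NAME"]]()

    def model(batch):
        # Copy the input so the caller's dict is not modified.
        return toy_head(backbone(dict(batch)))
    return model

old_model = build_model({"BACKBONE": {"NAME": "SumBackbone"}})
new_model = build_model({"BACKBONE": {"NAME": "MaxBackbone"}})
batch = {"points": [1, 2, 3]}
print(old_model(batch)["score"], new_model(batch)["score"])
```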
Conclusion
The OpenPCDet open-source project aims to give academia and industry a more flexible, comprehensive, and efficient codebase for 3D object detection from point clouds. We also hope it will attract more researchers to contribute support for additional algorithms and datasets, and so help move the field forward.