Uncertainty Quantification and its Applications for Multimodal Semantic Segmentation
Neural networks are commonly used to solve scene understanding problems such as semantic segmentation or object detection. Although neural networks can achieve outstanding accuracy, prediction errors can still happen with a certain probability. Therefore, it is essential to quantify the uncertainty. Besides that, uncertainty quantification also has an important role during optimization of neural networks. Furthermore, it is highly desirable to represent uncertainty in any system where these models are used, in a trustworthy manner.
This thesis is about uncertainty quantification and its applications for multimodal semantic segmentation in the area of automated driving.
The first contribution is a new active learning method for semantic segmentation of images. In this method, only quadratic image regions instead of whole ones are queried, which follow two selection criteria: it is assumed that the image regions cannot be segmented correctly, and that the annotation costs are low. Therefore, uncertainty quantification is applied to identify images with a low predicted segmentation quality. For the second criterion, a practical cost estimation approach is introduced.
The second contribution is a new uncertainty quantification method for semantic segmentation of lidar point clouds. This method aims to detect false positive segments as well as to provide a segment-wise prediction quality estimation. This yields an uncertainty quantification per predicted point but also a quality estimation on predicted segment level.
The third part of this work describes a new procedure for automated generation of high definition maps, which can be seen as semantic segmentation of the road environment. In this procedure, object detection and tracking are applied to detect road users. The detected road users as well as the driving path of the recording vehicle are aggregated and the map features, for example lanes, are extracted from this aggregated data. By aggregating data from multiple recordings and taking uncertainty measures into account, highly reliable high definition maps are generated.