Title: Multi-Modal Sensor Fusion-Based Semantic Segmentation for Snow Driving Scenarios
Authors: Sirawich Vachmanus; Ankit A. Ravankar; Takanori Emaru; Yukinori Kobayashi
Access: open access
Rights: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Keywords: roads; snow; image segmentation; sensors; semantics; feature extraction; training; machine learning; semantic segmentation; thermal camera; data fusion

Abstract: In recent years, autonomous vehicle driving technology and advanced driver assistance systems have played a key role in improving road safety. However, weather conditions such as snow pose severe challenges for autonomous driving and remain an active research area. Advances in computation and sensor technology have paved the way for deep learning and neural-network-based techniques that can replace classical approaches, thanks to their superior reliability, detection resilience, and improved accuracy. In this research, we investigate the semantic segmentation of roads in snowy environments. We propose a multi-modal fused RGB-T semantic segmentation network that takes a color (RGB) image and a thermal map (T) as inputs. This paper introduces a novel fusion module that combines the feature maps from both inputs. We evaluate the proposed model on a new snow dataset that we collected and on other publicly available datasets. The segmentation results show that the proposed fused RGB-T input can segregate human subjects in snowy environments better than an RGB-only input.
The fusion module plays a vital role in improving the efficiency of multi-input neural networks for person detection. Our results show that the proposed network achieves a higher success rate than other state-of-the-art networks. The combination of our fusion module and pyramid supervision path produced the best results in both mean accuracy and mean intersection over union on every dataset.

Publisher: IEEE (Institute of Electrical and Electronics Engineers)
Date: 2021-08-01
Language: English
Type: journal article (accepted manuscript)
Journal: IEEE Sensors Journal, vol. 21, no. 15, pp. 16839–16851
ISSN: 1530-437X
DOI: https://doi.org/10.1109/JSEN.2021.3077029
Handle: http://hdl.handle.net/2115/82600
Full text: https://eprints.lib.hokudai.ac.jp/dspace/bitstream/2115/82600/3/Final-Manuscript.pdf (application/pdf, 3.06 MB)
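The abstract does not describe the internals of the proposed fusion module. As a rough illustration only, one common pattern for fusing RGB and thermal feature maps is channel-wise concatenation followed by a learned linear projection back to the original channel count. The NumPy sketch below shows this data flow under that assumption; the function name, shapes, and weights are all hypothetical and do not represent the authors' actual architecture.

```python
import numpy as np

def fuse_rgb_t(rgb_feat, t_feat, w):
    """Fuse RGB and thermal feature maps (hypothetical illustration):
    concatenate along the channel axis, then project back to the
    original channel count with a 1x1-convolution-style linear map.

    rgb_feat, t_feat: arrays of shape (C, H, W)
    w: projection weights of shape (C, 2*C)
    """
    stacked = np.concatenate([rgb_feat, t_feat], axis=0)  # (2C, H, W)
    c2, h, wd = stacked.shape
    flat = stacked.reshape(c2, h * wd)                    # (2C, H*W)
    fused = w @ flat                                      # (C, H*W)
    return fused.reshape(-1, h, wd)                       # (C, H, W)

# Toy feature maps: 4 channels on an 8x8 spatial grid
rng = np.random.default_rng(0)
rgb = rng.standard_normal((4, 8, 8))
thermal = rng.standard_normal((4, 8, 8))
w = rng.standard_normal((4, 8))

out = fuse_rgb_t(rgb, thermal, w)
print(out.shape)  # (4, 8, 8)
```

In a real network the projection would be a trained convolution and the fusion would typically be repeated at several encoder depths; this sketch only conveys the shape bookkeeping of combining two modality streams into one feature map.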