A study on stacked object recognition and stacking operation planning combining 3D point cloud representation, deep-learning and physics engine

許, 雅俊

doi:https://doi.org/10.14943/doctoral.k15552

2023

Permalink : https://doi.org/10.14943/doctoral.k15552

Access count for this item: 506 (tallied 2026-03-18 01:23)

Available files

File	Format	Size	Views	Description
Yajun_Xu	pdf	6.76 MB	807

Thesis information

Access rights	open access
DOI	https://doi.org/10.14943/doctoral.k15552
URI	https://hdl.handle.net/2115/89561
Title	A study on stacked object recognition and stacking operation planning combining 3D point cloud representation, deep-learning and physics engine
Title (Japanese)	3次元点群表現,深層学習および物理エンジンを用いた積み上げ物体認識と積み上げ作業計画に関する研究
Author	許, 雅俊
Language	English
Keywords	Instance segmentation
	Wave-dissipating blocks
	Point cloud
	Physics engine
	Deep learning
Date of issue	2023-03-23
Publisher	Hokkaido University
Pages	xiv, 101p.
Abstract	Three-dimensional (3D) data, which provide richer geometric, shape, and scale information than 2D data, make it easier for machines to understand and interact with their surrounding environment. Typical 3D data include depth images, point clouds, meshes, and volumetric grids. Among them, point clouds are widely used in various fields, such as robotics, autonomous driving, and civil engineering, as they preserve the original geometric information in 3D space without discretization. In some specific scenes, many objects are stacked on each other. For instance, in a robotic bin-picking scene, heavily piled-up parts occlude each other; on the coast, large numbers of wave-dissipating blocks are stacked together to protect the embankment. Recognizing individual objects in these cluttered scenes poses a problem. Adding new objects based on the state of the stacked objects to address engineering requirements is an even more considerable challenge. In this dissertation, we design a 3D instance segmentation framework for stacked-object scenes using a deep neural network and then develop a system that simulates object stacking using a physics engine and deep learning to complete the object stacking plan based on our recognition results. Deep learning on point clouds has attracted increasing attention in recent years. 3D instance segmentation networks for indoor scenes have made some breakthroughs, and several non-real-time deep learning-based 3D recognition frameworks for indoor scenes have been developed recently. However, deep learning on 3D point clouds still faces several significant challenges, such as data annotation, the memory required to process large-scale point clouds, and time-consuming processing. We propose a fast point cloud clustering-based deep neural network, FPCC, for the instance segmentation of stacked objects.
The network simultaneously predicts the similarity of points and the likelihood of each point being a centroid. Based on the predicted results, this study designs a novel clustering algorithm that can quickly generate the final segmentation results. Experimental results on public datasets show that the proposed method performs excellently, reaching the current state-of-the-art precision and processing speed. Then, we extend the application scenario of this 3D instance segmentation scheme to the recognition of wave-dissipating blocks, a structural unit of breakwaters. Compared with current methods that monitor the whole structure of the breakwater, our method can monitor the blocks at the instance level. The recognition consists of three main steps: point cloud instance segmentation of the blocks, pose estimation, and classification. A novel point cloud feature extractor is designed to replace the original feature extractor of FPCC, which can process more points faster with the same computational overhead. The new feature extractor employs an attention-pooling mechanism, which allows the neural network to learn richer local information. Then, the block-wise 6D pose is estimated using a three-dimensional feature descriptor, point cloud registration, and CAD models of the blocks. Finally, the type of each segmented block is classified using the model registration results. The pose estimation results on real-world data showed that the fitting error between the reconstructed scene and the scene point cloud ranged between 30 and 50 mm, which is below 2% of the detected block size. The accuracy of block-type classification on real-world point clouds reached about 95%. These block detection results demonstrate the effectiveness of our approach.
Finally, based on the recognition results for the wave-dissipating blocks, a system is developed to simulate the block stacking plan utilizing a physics engine and deep learning, which can predict the number of additional blocks and their stacking poses and provide pre-visualization of their stacking operations. Deep learning was used to estimate the additional block poses that better fit the stacked blocks. The simulation was applied to an actual block-stacking operation in a local port in Hokkaido. The final construction results in the real world verified the accuracy and usefulness of the simulation. Overall, this dissertation makes three major contributions to object recognition and object stacking simulation. The first is the proposal of a fast framework for point cloud instance segmentation called FPCC. The second is the improvement of FPCC and its application to stacked wave-dissipating block scenes. Combined with pose estimation, this enables us to accurately retrieve the majority of the blocks in a 3D scene, monitoring the blocks at the instance level. The third is the development of a system for simulating block supplementation projects, which provides customizable pre-visualization results and block stacking solutions according to different construction requirements.
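Degree-granting institution	Identifier 10101, institution name 北海道大学 Hokkaido University
Date of degree conferral	2023-03-23
Degree number	甲第15552号
Degree name	Doctoral degree (Information Science)
Thesis examination committee	(Chief examiner) Professor 金井理, Professor 小野里雅彦, Professor 近野敦, Associate Professor 伊達宏昭
Graduate school conducting the examination	Graduate School of Information Science and Technology (Division of Information Science)
Resource type	Doctoral thesis
Publication type	Version of Record
Related information (isReferencedBy)	HDL https://hdl.handle.net/2115/89552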