A Study on Efficient Robust Speech Recognition with Stochastic Dynamic Time Warping

孫, 喜浩



Files in This Item:	Xihao_Sun.pdf (PDF, 1.29 MB)

Please use this identifier to cite or link to this item: https://doi.org/10.14943/doctoral.k11523

Related Items in HUSCAP:

A Study on Efficient Robust Speech Recognition with Stochastic Dynamic Time Warping [abstract of dissertation and summary of dissertation review]

Title:	A Study on Efficient Robust Speech Recognition with Stochastic Dynamic Time Warping
Other Titles:	確率的DTWを用いた高効率ロバスト音声認識に関する研究
Authors:	孫, 喜浩¹
Authors(alt):	Sun, Xihao¹
Issue Date:	25-Sep-2014
Publisher:	Hokkaido University
Abstract:	In recent years, great progress has been made in automatic speech recognition (ASR) systems. The hidden Markov model (HMM) and dynamic time warping (DTW) are the two main algorithms that have been widely applied to ASR systems. Although the HMM technique achieves higher recognition accuracy in both clean and noisy environments, it requires a large set of words and is more complex to implement. Thus, more and more researchers have focused on DTW-based ASR systems. DTW is based on template matching; it accomplishes time alignment of reference and test speech features by dynamic programming. Conventional DTW is fast and has low complexity, but its recognition accuracy is limited. Therefore, conventional DTW has mostly been used for speech recognition in clean environments. Recently, a DTW algorithm with multireferences (mDTW) has been developed to improve recognition accuracy relative to the HMM algorithm under noisy conditions. However, the mDTW algorithm increases the calculation cost and requires more memory, which reduces the practicability of the system. It is possible to reconstruct the multireferences so that the method requires less memory and incurs a lower calculation cost. Therefore, this study proposes a reconstruction method that adds a training part to the DTW-based ASR system. The proposed reconstruction of references is aimed at making the DTW algorithm more effective. According to the DTW algorithm, the optimal warping path implies a minimum error between any two given sequences. The proposed algorithm uses this property to build a new reference that replaces the original two. The process has three stages. First, for each reference word, the speech utterances are divided into two subsets. Second, for each pair of subsets, the optimal path is computed and the new reference replaces the pair of subsets. Finally, the new references are input to the DTW-based ASR system to obtain the recognition accuracy. The feasibility of the proposed technique was examined using computer simulations, and the results demonstrated its effectiveness. In white noise, our approach yields 96.94% accuracy compared with 97.54% for mDTW at 20 dB, and 84.4% compared with 86.44% for mDTW at 10 dB. In babble noise, our approach yields 94.12% accuracy compared with 94.14% for mDTW at 20 dB, and 80.82% compared with 81.64% for mDTW at 10 dB. Compared with mDTW, the proposed technique reduces the calculation cost by 41.6%.
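The three stages described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the dissertation: the function names (`dtw_path`, `merge_references`, `recognize`), the Euclidean frame distance, the three-direction step pattern, and in particular the frame-averaging merge rule are all assumptions made for illustration, since the abstract states only that a new reference is built from the optimal path between two references.

```python
import math


def dtw_path(a, b):
    """Align two non-empty sequences of feature vectors with classic DTW.

    Returns (total_cost, path), where path is a list of (i, j) index pairs.
    Assumption: Euclidean frame distance and steps (1,0), (0,1), (1,1).
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end of both sequences to recover the optimal path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def merge_references(a, b):
    """Hypothetical merge rule: average the frames paired by the optimal path."""
    _, path = dtw_path(a, b)
    return [[(x + y) / 2 for x, y in zip(a[i], b[j])] for i, j in path]


def recognize(test, references):
    """Return the label whose reference has the smallest DTW cost to `test`.

    `references` maps each word label to a list of reference sequences.
    """
    scored = (
        (dtw_path(test, ref)[0], label)
        for label, refs in references.items()
        for ref in refs
    )
    return min(scored)[1]
```

Under these assumptions, calling `merge_references` on each pair of utterances of a word leaves one reference where there were two, so `recognize` runs fewer DTW comparisons per test utterance. Note that a merged reference produced this way is as long as the warping path, which can exceed the length of either input.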
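Conferring University:	Hokkaido University
Degree Report Number:	甲第11523号
Degree Level:	Doctoral
Degree Discipline:	Information Science
Examination Committee Members:	(Chief examiner) Professor 宮永喜一, Specially Appointed Professor 野島俊雄, Specially Appointed Professor 小川恭孝, Professor 齊藤晋聖, Associate Professor 筒井弘
Degree Affiliation:	Graduate School of Information Science and Technology (Division of Media and Network Technologies)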
Type:	theses (doctoral)
URI:	http://hdl.handle.net/2115/57251
Appears in Collections:	Theses > Doctor (Information Science); Doctorate by way of Advanced Course > Graduate School of Information Science and Technology
