Disfluencies Detection and Reconstruction of Spontaneous Speech Transcripts
2pm
Room 5566 (Lifts 27-28), 5/F Academic Building, HKUST


Examination Committee

Prof Chin-Tau LEA, ECE/HKUST (Chairperson)
Prof Pascale FUNG, ECE/HKUST (Thesis Supervisor)
Prof Tim CHENG, ECE/HKUST 

 

Abstract

Disfluencies in spoken language remain a challenge for both language processing applications and human perception. Identifying and removing disfluencies is therefore a beneficial step toward improving the performance of spoken language understanding tasks. In particular, it is important for closed captioning of video conferences.
 
We investigated different approaches to disfluency identification and removal, ranging from rule-based methods and translation models to supervised classification using either Conditional Random Fields (CRF) or Deep Neural Networks (DNN).
 
Because the supervised classifiers in our task require a large amount of human annotation and labeling, we also explored the rule-based approach and the translation model, both of which need less human labeling.
 
In the rule-based approach, we used regular expressions to match simple disfluent words and obtained a Bilingual Evaluation Understudy (BLEU) score of 81.01, where BLEU serves as a measure of reconstruction quality.
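
As an illustration of what regular-expression matching of simple disfluencies can look like, here is a minimal Python sketch. The filler list, the repetition rule, and the function name are assumptions made for this example; the abstract does not state the actual rules used in the thesis.

```python
import re

# Hypothetical list of filled pauses; not taken from the thesis.
FILLED_PAUSE = re.compile(r"\b(?:uh|um|er|ah)\b[,.]?\s*", re.IGNORECASE)

# An immediately repeated word, e.g. "I I think" or "the the plan".
REPEATED_WORD = re.compile(r"\b(\w+)(?:\s+\1\b)+", re.IGNORECASE)


def remove_simple_disfluencies(text: str) -> str:
    """Drop filled pauses and collapse immediate word repetitions."""
    text = FILLED_PAUSE.sub("", text)
    text = REPEATED_WORD.sub(r"\1", text)
    # Normalize whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()


print(remove_simple_disfluencies("um I I think uh the the plan works"))
# prints "I think the plan works"
```

Rules of this kind only cover surface patterns; restarts and rephrasings that change wording would need a different mechanism.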
 
Secondly, we used a Weighted Finite State Transducer (WFST) to translate phrases containing complex disfluencies into more fluent ones and obtained a BLEU score of 72.02.
 
For the first time, we applied a DNN to the same features as those used by its rival and obtained better performance for one disfluency type, repeats, with a precision of 81.7% and a recall of 82.0%. Furthermore, the overall weighted F1 score (the harmonic mean of precision and recall) across disfluency types improved by 3.7% when we added a word vector feature to the CRF system. We also constructed a DNN-CRF hybrid system using a voting algorithm. For the false-start disfluency type, our F1 score is 52.2%, a significant improvement over the CRF baseline of 36.7%.
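
For readers unfamiliar with the F1 measure, the short Python sketch below computes the harmonic mean of precision and recall and applies it to the repeat figures quoted above. The function name is illustrative, and the resulting value is derived here from the stated precision and recall; it is not a figure reported in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in the range 0..1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall quoted for the "repeat" disfluency type.
repeat_f1 = f1_score(0.817, 0.820)
print(f"{repeat_f1:.3f}")  # prints 0.818
```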
 
Carefully combining all three approaches improved the BLEU score from 81.01 to 82.80.

Speaker/Performer:
Mr Linlin WANG
Language
English