Thyroid Ultrasound Report Generation Based on Cyrillic Mongolian Speech Recognition

Source: Donghua University
Language and speech are the most important and direct means of human communication and play an irreplaceable role in daily life. With the development of deep learning and the continuing advance of artificial intelligence, people's expectations of speech recognition keep rising, which has driven a series of research and development efforts on speech recognition systems. Deep learning (DL), the machine learning approach that has attracted the most attention in recent years, has achieved remarkable results in many fields, including speech recognition and image processing. Now that the newest technologies are widely used in developed countries, there is a clear need to design a Cyrillic Mongolian speech recognition system for use in Mongolia. The Mongolian language has not only a large number of synonyms and homonyms but also complicated grammar and question structures. These factors must be taken into account in speech recognition; they make training difficult, and recognition results are not yet ideal. At present, more and more acoustic models in speech recognition are built from neural networks and studied in depth. Among them, the deep neural network (DNN) is the mainstream acoustic model.
  The purpose of this study was to establish efficient thyroid ultrasound report generation based on Cyrillic Mongolian speech recognition, with the aim of improving the state of technology in the Mongolian medical field. We propose a Cyrillic Mongolian speech input system for thyroid ultrasound reports that consists mainly of three parts: a speech input system, a Cyrillic Mongolian speech recognition system, and the thyroid ultrasound report. The system captures the doctor's voice through a microphone during the examination and digitizes it; the Cyrillic Mongolian speech recognition system then converts the speech data into text, and the text is output in the form of a report. The recognition system is based on a convolutional neural network (CNN). The CNN's convolution and pooling layers reduce the number of parameters in training, handle the large volume of Cyrillic Mongolian data better, and reduce the complexity of the model, which makes the CNN better suited to Mongolian speech recognition. Therefore, to improve the accuracy of Cyrillic Mongolian speech recognition, a Cyrillic Mongolian speech recognition system based on a deep convolutional neural network acoustic model was designed and constructed. The results of the study are as follows:
  (1) To address the forced alignment of speech required when training traditional acoustic models, an end-to-end convolutional neural network (CTC-CNN) acoustic model was proposed. It combines the CNN with an end-to-end structure and optimizes the likelihood of the input and output sequences. The experimental results show that the error rate of the Cyrillic Mongolian speech recognition system based on the CTC-CNN acoustic model is 17.7%. Compared with the Cyrillic Mongolian speech recognition system based on the CNN acoustic model, the accuracy is improved by 1.2%.
  (2) In the CTC-CNN model, the CNN is a shallow structure with two convolution layers, and the recognition performance of such a shallow convolutional neural network is limited. To further improve accuracy, an end-to-end deep convolutional neural network (CTC-DCNN) model was designed based on the residual block structure. The vanishing-gradient problem in the model is alleviated by optimization with the maxout function. On this basis, a new improved end-to-end deep convolutional neural network acoustic model (CTC-DCNN optimization) was proposed to improve the accuracy and modeling ability of the network. The experimental results show that, compared with the CNN model, this model reduces the word error rate in speech recognition by 4% to 4.7%.
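The statement that convolution and pooling layers reduce the number of parameters can be illustrated with simple arithmetic. The sketch below is not taken from the thesis: the input size (200 frames by 40 features), the 3x3 kernel, and the 32 output channels are hypothetical values chosen only for illustration.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv2d_params(k_h: int, k_w: int, c_in: int, c_out: int) -> int:
    """Weights plus biases of a 2-D convolution layer.

    The kernel is shared across all positions, so the count does not
    depend on the size of the input feature map.
    """
    return k_h * k_w * c_in * c_out + c_out


def pooled_size(size: int, pool: int) -> int:
    """Length along one axis after non-overlapping pooling."""
    return size // pool


# Hypothetical input: 200 frames x 40 features, one channel (assumed values).
frames, feats = 200, 40

print(conv2d_params(3, 3, 1, 32))        # 320
print(dense_params(frames * feats, 32))  # 256032
print(pooled_size(frames, 2), pooled_size(feats, 2))  # 100 20
```

Under these assumed sizes, the convolution layer needs 320 parameters where a fully connected layer with the same number of outputs needs 256032, and 2x2 pooling halves each axis of the feature map passed to the next layer.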
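Result (1) refers to training without forced alignment by optimizing a sequence likelihood. Assuming that "CTC" here denotes connectionist temporal classification, the following minimal sketch shows how that likelihood is commonly computed: it sums the probability of every frame-level path that collapses to the target label sequence. The function names, the blank index 0, and the plain-list representation are assumptions made for this illustration, not the thesis implementation.

```python
from itertools import groupby

BLANK = 0  # index reserved for the CTC blank symbol (assumed convention)


def ctc_collapse(path):
    """Map a frame-level path to a label sequence: merge repeats, drop blanks."""
    return [label for label, _ in groupby(path) if label != BLANK]


def ctc_likelihood(probs, target):
    """Return P(target | input) with the CTC forward algorithm.

    probs[t][k] is the network's probability of label k at frame t.
    """
    # Extended target: blanks before, between, and after the labels.
    ext = [BLANK]
    for label in target:
        ext += [label, BLANK]
    size = len(ext)

    alpha = [0.0] * size
    alpha[0] = probs[0][BLANK]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += alpha[s - 2]             # skip the blank in between
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid paths end on the last label or on the trailing blank.
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)


print(ctc_collapse([0, 1, 1, 0, 2]))  # [1, 2]
```

Training would maximize this likelihood (in practice, minimize its negative logarithm) over the data, so no frame-by-frame alignment between audio and transcript has to be supplied.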
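Result (2) combines a residual block structure with the maxout function. The abstract does not say how the two are arranged, so the sketch below is only one plausible reading: a skip connection added to a maxout-activated transform. For brevity it uses a fully connected transform on plain lists rather than a convolution, and all names are hypothetical.

```python
def linear(x, weights, bias):
    """Affine transform: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def maxout(z, k):
    """Maxout activation: the maximum over each group of k consecutive units."""
    if len(z) % k != 0:
        raise ValueError("length of z must be a multiple of k")
    return [max(z[i:i + k]) for i in range(0, len(z), k)]


def residual_maxout_block(x, weights, bias, k=2):
    """Compute y = x + maxout(W x + b).

    `weights` needs k * len(x) rows so that the maxout output has the
    same length as x and can be added to the skip connection.
    """
    h = maxout(linear(x, weights, bias), k)
    return [xi + hi for xi, hi in zip(x, h)]


# With these weights each maxout pair is (x_i, -x_i), so maxout yields |x_i|.
W = [[1, 0], [-1, 0], [0, 1], [0, -1]]
b = [0.0, 0.0, 0.0, 0.0]
print(residual_maxout_block([1.0, -2.0], W, b))  # [2.0, 0.0]
```

The identity path `x + ...` gives gradients a direct route through the block, and maxout is piecewise linear, so it does not saturate the way sigmoid-like activations do; both properties are commonly cited as helping against vanishing gradients in deep networks.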