Research on named entity recognition of Chinese medical records based on BERT-BiLSTM-CRF with Chinese radicals

Corresponding author: xiaoxiaoxia, amily_x@hnucm.edu.cn
DOI: 10.12201/bmr.202303.00004
Statement: This article is a preprint and has not been peer-reviewed. It reports new research that has yet to be evaluated and so should not be used to guide clinical practice.
    Abstract: Purpose: To study methods for extracting medical terms from Chinese medical records, enabling automatic structuring of the records and providing structured data for knowledge discovery. Method: This paper proposes a deep learning named entity recognition model that combines BERT, BiLSTM, and CRF with radical features. The model embeds Chinese radicals into the BERT word vectors, extracts entity features with a BiLSTM, and uses a CRF for sequence prediction. A corpus of 400 manually annotated medical cases, totaling more than 50,000 words, was divided into a training set and a test set at a ratio of 3 to 1, and the model was used to identify four types of named entities in Chinese medical records: body, medicine, symptom, and disease. Result: The model achieves an F1 value of 84.81% on the test set, outperforming the compared models without embedded radicals, which indicates that it identifies named entities in Chinese medical records more effectively and better supports structuring them.
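
    The evaluation setup described in the abstract (a 3-to-1 split of the annotated cases and an F1 value over four entity types) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the abstract does not state the tagging scheme or how F1 is computed, so the BIO tags, the labels `SYM` and `DIS`, the exact-match entity-level scoring, and the helper names `split_records`, `bio_to_spans`, and `entity_f1` are all assumptions made here.

```python
import random
from typing import List, Sequence, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def split_records(records: Sequence, train_parts: int = 3,
                  test_parts: int = 1, seed: int = 0):
    """Shuffle and split records by ratio (3:1 as in the abstract)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def entity_f1(gold_seqs: List[List[str]], pred_seqs: List[List[str]]) -> float:
    """Micro-averaged F1 where a prediction counts only on exact span and type match."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = bio_to_spans(gold), bio_to_spans(pred)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Placeholder IDs standing in for the 400 annotated cases.
train_ids, test_ids = split_records(range(400))

# Toy sequences: the prediction finds the symptom but misses the disease.
gold = [["B-SYM", "I-SYM", "O", "B-DIS"]]
pred = [["B-SYM", "I-SYM", "O", "O"]]
score = entity_f1(gold, pred)
```

    With 400 records the split yields 300 training and 100 test cases. In the toy example, precision is 1 and recall is 0.5, so the F1 value is 2/3. Exact-match scoring is the stricter of the common choices; a token-level F1 would give different numbers on the same predictions.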

    Key words: Entity recognition; Radical features; BERT; BiLSTM; CRF

    Submit time: 22 March 2023

    Copyright: The copyright holder for this preprint is the author/funder, who has granted biomedRxiv a license to display the preprint in perpetuity.
  • Version history
    ID: 1
    Submit time: 2022-10-11
    Number: bmr.202303.00004V1
Get Citation

xiaoxiaoxia. Research on named entity recognition of Chinese medical records based on BERT-BiLSTM-CRF with Chinese radicals. 2023. biomedRxiv.202303.00004

Article Metrics

  • Read: 446
  • Download: 3
  • Comment: 0
