
Research on Chinese electronic medical record entity mapping method by fusing similarity algorithm and pre-trained model

Corresponding author: renhuiling, ren.huiling@imicams.ac.cn
DOI: 10.12201/bmr.202305.00015
Statement: This article is a preprint and has not been peer-reviewed. It reports new research that has yet to be evaluated and so should not be used to guide clinical practice.

    Abstract: Purpose/Significance To make full use of the entity resources in Chinese electronic medical records, this paper studies combinations of algorithms and models that suit large-scale entity mapping and do not require manually constructed rule features. Method/Process Using a self-annotated standard dataset of Chinese electronic medical records, similarity algorithms and pre-trained models are fused and applied to the candidate entity generation and entity disambiguation stages of entity mapping, respectively, and the performance of the different similarity algorithms and pre-trained models selected in this paper is compared and analyzed. Results/Conclusion A method that improves the mapping of drug entities is proposed. The combination of the Jaccard similarity algorithm and the BERT pre-trained model is selected; it achieves more than 90% accuracy and 99% recall on the entity mapping task and can efficiently perform entity mapping for massive volumes of Chinese electronic medical records.

    Key words: Entity mapping; Entity standardization; Similarity algorithm; Electronic medical records

    Submit time: 17 May 2023

    Copyright: The copyright holder for this preprint is the author/funder, who has granted biomedRxiv a license to display the preprint in perpetuity.
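
    The candidate entity generation stage named in the abstract can be illustrated with a minimal sketch. It assumes character-level Jaccard similarity over a small list of standard terms; the function names, the `top_k` and `threshold` parameters, and the example drug names are hypothetical and are not taken from the paper. The BERT-based disambiguation stage is not shown.

```python
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the character sets of two strings."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0  # assumption: two empty strings count as dissimilar
    return len(set_a & set_b) / len(set_a | set_b)


def generate_candidates(
    mention: str,
    standard_terms: List[str],
    top_k: int = 3,
    threshold: float = 0.0,
) -> List[Tuple[str, float]]:
    """Rank standard terms by Jaccard similarity to the mention.

    Terms scoring at or below `threshold` are dropped, and at most
    `top_k` candidates are returned, highest score first.
    """
    scored = [(term, jaccard(mention, term)) for term in standard_terms]
    kept = [(term, score) for term, score in scored if score > threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


if __name__ == "__main__":
    # Hypothetical mention and standard vocabulary, for illustration only.
    terms = ["阿司匹林", "阿莫西林", "布洛芬"]
    for term, score in generate_candidates("阿司匹林肠溶片", terms):
        print(term, round(score, 3))
```

    In a pipeline of the kind the abstract describes, the short candidate list produced this way would then be passed to a pre-trained model that picks the final standard entity; keeping the first stage cheap is what makes the approach practical at scale.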

  • chenjieqing, zhangfeng. Named Entity Recognition in Chinese Electronic Medical Records Using Knowledge Graph Construction. 2023. doi: 10.12201/bmr.202312.00011

    Deng Jiale, Hu Zhensheng, Lian Wanmin, Hua Yunpeng, Zhou Yi. Research on entity recognition of liver cancer electronic medical records based on RoBERTa-CRF. 2023. doi: 10.12201/bmr.202303.00027

    wuxuehong. A method of recognizing entities from Chinese Electronic Medical Record based on domain word vector combined with word attributes reasoning. 2021. doi: 10.12201/bmr.202109.00016

    chenjianqiu, huangxiaofang. Joint extraction of Chinese EMR entity relationship based on bert. 2022. doi: 10.12201/bmr.202206.00003

    xiaoxiaoxia. Research on named entity recognition of Chinese medical records based on BERT-BiLSTM-CRF with Chinese radicals. 2023. doi: 10.12201/bmr.202303.00004

    xie jia qi. Leveraging Pre-trained Language Model for Consumer Health Question Classification. 2021. doi: 10.12201/bmr.202101.00017

    Deng Lan, Du Tongzhou. An Efficient, Secure and Multi-keyword Search Scheme on Encrypted Electronic Medical Records. 2021. doi: 10.12201/bmr.202105.00008

    HU Haiyang, ZHAO Congpu, Ma Lian, JIANG Huizhen, ZHANG Jing, ZHU Weiguo. Attention Mechanism And Dilated Convolution Neural Networks for Named Entity Recognition. 2021. doi: 10.12201/bmr.202102.00004

    shenrongrong, xiashuaishuai, yanjunfeng. Review on Research of Named Entity Recognition in Chinese Medicine. 2022. doi: 10.12201/bmr.202207.00038

    ZHAO Jia-Qi, WANG Xiao-Feng, FAN Yu-Yu, ZHANG Wei, WANG Hui-Xuan, LI Jin-Shan. Research on the Quality and Countermeasures of Electronic Medical Record Data. 2020. doi: 10.12201/bmr.202011.00008

  • ID  Submit time  Number
    1   2023-02-20   bmr.202305.00015V1

Get Citation

renhuiling, lixiaoying, wangweijie, wangxu, zhangying. Research on Chinese electronic medical record entity mapping method by fusing similarity algorithm and pre-trained model. 2023. biomedRxiv.202305.00015

