Joint extraction of Chinese EMR entities and relations based on BERT

Corresponding author: huangxiaofang, 448401501@qq.com
DOI: 10.12201/bmr.202206.00003
Statement: This article is a preprint and has not been peer-reviewed. It reports new research that has yet to be evaluated and so should not be used to guide clinical practice.

    Abstract: Electronic medical records (EMRs) contain clinical information that medical staff generate during patient care, including a large number of medical entities related to patients' health. Efficiently extracting medical information from unstructured medical record text has become a research hotspot in natural language processing (NLP). Current joint entity and relation extraction models mainly identify entities first and then classify the relations between them. However, this approach is affected by redundant entities and cannot adequately capture the interaction between entities and relations. To address these problems, this paper uses a cascade decoder for relation extraction: a head entity identification module first identifies the head entity, and a relation-specific tail entity tagging module then identifies the tail entities for each relation. EMR text is also characterized by a high density of entities and by overlapping relations between entities. Accordingly, this paper uses pointer tagging to handle nested entities in EMR documents and improves the relation-specific tail entity tagger to handle overlapping relations between entities. The comparative experiments use two mainstream models as baselines and evaluate them on the CHIP2020 dataset. The F value of the proposed method improves by 3 percentage points, and the experiments indicate that the method is effective for relation extraction.
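
The cascade decoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no model details, so the sketch assumes binary start/end pointer tagging with a 0.5 threshold and nearest-end span matching, and it takes hand-written probabilities in place of BERT outputs. The names `decode_spans`, `extract_triples`, and `tail_scores`, and the toy sentence, are hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices
Pointers = Tuple[List[float], List[float]]  # (start_probs, end_probs)


def decode_spans(start_probs: List[float], end_probs: List[float],
                 threshold: float = 0.5) -> List[Span]:
    """Pair each start pointer with the nearest end pointer at or after it.

    Assumption: nearest-end matching; the abstract does not say how
    start and end pointers are paired.
    """
    starts = [i for i, p in enumerate(start_probs) if p >= threshold]
    ends = [i for i, p in enumerate(end_probs) if p >= threshold]
    spans = []
    for s in starts:
        following = [e for e in ends if e >= s]
        if following:
            spans.append((s, following[0]))
    return spans


def extract_triples(tokens: List[str],
                    head_pointers: Pointers,
                    tail_scores: Dict[Span, Dict[str, Pointers]],
                    threshold: float = 0.5) -> List[Tuple[str, str, str]]:
    """Step 1: decode head entities. Step 2: for each head and each
    relation, decode tail entities from that relation's own pointers."""
    def text(span: Span) -> str:
        return " ".join(tokens[span[0]:span[1] + 1])

    triples = []
    for head in decode_spans(*head_pointers, threshold):
        # tail_scores stands in for the relation-specific tail tagger,
        # which a real model would compute conditioned on the head.
        for relation, (t_start, t_end) in tail_scores.get(head, {}).items():
            for tail in decode_spans(t_start, t_end, threshold):
                triples.append((text(head), relation, text(tail)))
    return triples


# Toy input (hypothetical probabilities, not model outputs).
tokens = ["diabetes", "causes", "retinopathy", "and", "neuropathy"]
head_pointers = ([0.9, 0.1, 0.2, 0.0, 0.1], [0.8, 0.1, 0.1, 0.0, 0.2])
tail_scores = {
    (0, 0): {
        "complication": ([0.0, 0.0, 0.9, 0.0, 0.7], [0.0, 0.0, 0.8, 0.0, 0.9]),
        "treated_by": ([0.1] * 5, [0.1] * 5),  # no pointer fires: no triple
    }
}
triples = extract_triples(tokens, head_pointers, tail_scores)
print(triples)
```

Because every relation has its own pair of tail pointers, one head entity can take part in several triples, and a relation whose pointers never fire contributes nothing, so no separate classification step over candidate entity pairs is needed.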

    Key words: natural language processing; Chinese EMR; relation extraction; joint extraction model

    Submit time: 1 June 2022

    Copyright: The copyright holder for this preprint is the author/funder, who has granted biomedRxiv a license to display the preprint in perpetuity.
  • ID: 1; Submit time: 2022-01-05; Number: bmr.202206.00003V1
Get Citation

chenjianqiu, huangxiaofang. Joint extraction of Chinese EMR entities and relations based on BERT. 2022. biomedRxiv.202206.00003
