Construction of a General Medical Knowledge Graph Based on Evidence-Based Medicine and Electronic Medical Record Data

Corresponding author: He Kunlun, kunlunhe@plagh.org
DOI: 10.12201/bmr.202409.00027
Disclaimer: Papers posted on the preprint system are intended only for the exchange and sharing of the latest research findings. They have not been peer reviewed, so applying them directly to guide clinical practice is not recommended.
  • Abstract: Purpose/Significance: To construct a general medical knowledge graph covering evidence-based medical knowledge and electronic medical record (EMR) data, supporting applications such as data governance, clinical decision support, and treatment plan recommendation, and to improve the graph's effectiveness in application by drawing on real-world expert experience. Method/Process: Multi-source heterogeneous data were surveyed, well-known knowledge graphs from China and abroad were integrated, and the graph schema was designed. Word embeddings from the RoBERTa pre-trained model were used for named entity recognition and relation extraction from medical literature, web literature, textbooks, medical databases, and electronic medical records. Graph quality was evaluated with the rule-based SWIQA framework and a manual review strategy based on random sampling. Result/Conclusion: A total of 128 ontologies and 1,108 relation types were identified and stored in a database as triples, and the semantic accuracy of the graph was 93.8%. The resulting general knowledge graph covers not only evidence-based medical knowledge but also expert experience generated in real-world clinical practice, and can support medical artificial intelligence applications.

    Key words: Evidence-based medicine; electronic medical records (EMR); medical knowledge graph

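    Submitted: 2024-09-14

    Copyright statement: The authors hold sole copyright to this paper; the preprint system holds only the right to preserve it permanently. No one may reuse it without permission.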
  • Version history (No., submission date, ID):
    1, 2024-06-04, bmr.202409.00027V1
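Citation format

Wu Huan, He Kunlun. Construction of a General Medical Knowledge Graph Based on Evidence-Based Medicine and Electronic Medical Record Data. 2024. biomedRxiv.202409.00027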
