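文志华, 张龙信, 刘青萍, 肖莉, 夏帅帅, 晏峻峰. Method and Tool Construction for Data Collection and Processing of Classical Formula Medical Cases Serving Intelligent Diagnosis and Treatment. 2025. biomedRxiv.202501.00065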
Method and Tool Construction for Data Collection and Processing of Classical Formula Medical Cases Serving Intelligent Diagnosis and Treatment
Corresponding author: Wen Zhihua, wenzhihua137@qq.com
DOI:10.12201/bmr.202501.00065
Abstract: Purpose/Significance Research on artificial-intelligence-based models and systems for traditional Chinese medicine (TCM) diagnosis and treatment is developing rapidly. Collecting and processing TCM medical case data is an essential foundation for this research, and the shortage of high-quality case data is a key factor limiting current work on intelligent TCM models. Method/Process This paper constructs a general framework for collecting and processing classical formula medical case data. It proposes collection methods for four sources (web pages, paper documents, electronic medical records, and existing databases) together with preprocessing steps including data cleaning, similarity detection, and manual proofreading, and gives the corresponding algorithms and data processing results. Result/Conclusion A total of 12,499 valid classical formula medical case records were obtained, providing base data for downstream tasks in training intelligent TCM diagnosis and treatment models. The proposed collection and processing methods also offer a reference approach for large-scale data processing in the TCM field.
Key words: Medical case data collection; Medical case data processing; Classical formula medical case data set; Artificial intelligence; Traditional Chinese medicine
Submitted: 2025-01-22
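Copyright notice: The author independently holds the copyright of this paper; the preprint system holds only the right to preserve the paper permanently. No one may reuse it without permission.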
Version 1: submitted 2024-12-03, ID bmr.202501.00065V1