
Evaluating Data Mining Algorithms for Consumer Health-Related Question Classification

Corresponding author: Li Jiao, li.jiao@imicams.ac.cn
DOI: 10.12201/bmr.202101.00018
Statement: This article is a preprint and has not been peer-reviewed. It reports new research that has yet to be evaluated and so should not be used to guide clinical practice.

    Abstract: To identify better-performing data mining algorithms for intelligent medical information systems, it is crucial to assess the algorithms’ performance on a specific medical problem. This study set up an algorithm evaluation task for consumer health-related question classification, an important scenario of consumer health question understanding in Internet-based healthcare services. The study constructed a corpus of 8,000 health-related questions, using random sampling and cross-validation methods for data collection and annotation. An evaluation metric was defined for the question classification task accordingly. The evaluation task attracted 396 teams from both the research and industry communities, and 149 of them submitted their algorithms for online evaluation. The best-performing submitted algorithm achieved a macro-F1 score of 0.755 on the independent test set. All the evaluation resources are openly accessible, including the corpus, evaluation metrics and scripts.
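The abstract reports results as a macro-F1 score. As a minimal sketch of how such a score is commonly computed, the Python below averages per-class F1 with equal weight per class. It assumes a single-label setting, and the function name `macro_f1` and the example labels are hypothetical; the task's released evaluation scripts may differ in detail.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every label seen in gold or pred.

    Assumption: each question carries exactly one label (single-label task).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1      # correct prediction for class g
        else:
            fp[p] += 1      # p was predicted but is wrong
            fn[g] += 1      # g was the true class but was missed

    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)

    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical labels, for illustration only
gold = ["diagnosis", "treatment", "diagnosis", "cause"]
pred = ["diagnosis", "diagnosis", "diagnosis", "cause"]
score = macro_f1(gold, pred)
```

Because every class contributes equally to the average, a rare question category that is classified poorly lowers the macro-F1 as much as a frequent one would, which is why the metric is often preferred for imbalanced corpora.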

    Key words: Consumer; Health Question; Classification Algorithm; Evaluation

    Submit time: 27 May 2021

    Copyright: The copyright holder for this preprint is the author/funder, who has granted biomedRxiv a license to display the preprint in perpetuity.


  • Version history: version 1, submitted 2021-01-18, number bmr.202101.00018V1

Get Citation

Xu Xiaowei, Guo Haihong, Li Jiao. Evaluating Data Mining Algorithms for Consumer Health-Related Question Classification. 2021. biomedRxiv.202101.00018

