Karystianis George, Sheppard Therese, Dixon William G, Nenadic Goran
School of Computer Science, University of Manchester, Manchester, UK.
The Christie NHS Foundation Trust, Manchester, UK.
BMC Med Inform Decis Mak. 2016 Feb 9;16:18. doi: 10.1186/s12911-016-0255-x.
Free-text medication prescriptions contain detailed instruction information that is key when preparing drug data for analysis. The objective of this study was to develop a novel model and automated text-mining method to extract detailed structured medication information from free-text prescriptions and explore their variability (e.g. optional dosages) in primary care research databases.
We introduce a prescription model that provides minimum and maximum values for dose number, frequency and interval, allowing variability and flexibility within a drug prescription to be modelled. We developed a rule-based text-mining system to extract this structured information from free-text dosage instructions in prescriptions. The system was applied to medication prescriptions from an anonymised primary care electronic record database (Clinical Practice Research Datalink, CPRD).
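To make the min/max model concrete, the following is a minimal sketch, not the authors' system. The class name `Prescription`, the function `extract`, the regular expressions, and the choice to express the interval in days are all assumptions made for illustration; the rules actually used in the study are not given in this abstract.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Prescription:
    """Hypothetical container: each attribute has a minimum and a maximum."""
    dose_min: Optional[float] = None
    dose_max: Optional[float] = None
    freq_min: Optional[float] = None
    freq_max: Optional[float] = None
    interval_min: Optional[float] = None  # assumed unit: days
    interval_max: Optional[float] = None

    def has_variability(self) -> bool:
        """True if any attribute has a minimum that differs from its maximum."""
        pairs = [
            (self.dose_min, self.dose_max),
            (self.freq_min, self.freq_max),
            (self.interval_min, self.interval_max),
        ]
        return any(
            lo is not None and hi is not None and lo != hi for lo, hi in pairs
        )


# Illustrative rules only: a number, optionally followed by "-", "to" or "or"
# and a second number, e.g. "1-2" or "3 to 4".
NUM = r"(\d+(?:\.\d+)?)"
RANGE = rf"{NUM}(?:\s*(?:-|to|or)\s*{NUM})?"
DOSE_RE = re.compile(rf"{RANGE}\s*(?:tablets?|capsules?|puffs?)", re.I)
FREQ_RE = re.compile(rf"{RANGE}\s*times\s*(?:a|per)\s*day", re.I)
WORD_FREQ_RE = re.compile(r"\b(once|twice)\s+(?:a\s+day|daily)", re.I)
WORD_FREQ = {"once": 1.0, "twice": 2.0}


def extract(text: str) -> Prescription:
    """Apply the toy rules above to one free-text dosage instruction."""
    p = Prescription()
    m = DOSE_RE.search(text)
    if m:
        p.dose_min = float(m.group(1))
        # A single number means minimum and maximum coincide.
        p.dose_max = float(m.group(2) or m.group(1))
    m = FREQ_RE.search(text)
    if m:
        p.freq_min = float(m.group(1))
        p.freq_max = float(m.group(2) or m.group(1))
        p.interval_min = p.interval_max = 1.0  # "a day" -> 1 day
    else:
        w = WORD_FREQ_RE.search(text)
        if w:
            p.freq_min = p.freq_max = WORD_FREQ[w.group(1).lower()]
            p.interval_min = p.interval_max = 1.0
    return p
```

With these toy rules, an instruction such as `"TAKE 1-2 TABLETS 3 TO 4 TIMES A DAY"` yields a dose of 1 to 2 and a frequency of 3 to 4, so `has_variability()` is true, whereas `"take 2 tablets twice a day"` yields equal minimum and maximum values and no variability.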
We evaluated our approach on a test set of 220 CPRD prescription free-text directions. The system achieved an overall accuracy of 91% at the prescription level, with 97% accuracy across the attribute levels. We then analysed more than 56,000 of the most common free-text prescriptions from CPRD records and found that 1 in 4 had inherent variability, i.e. a choice in taking medication specified by different minimum and maximum doses, duration or frequency.
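The difference between the two reported figures can be illustrated with a short sketch. It assumes that prescription-level accuracy counts a prescription as correct only when every attribute matches the gold standard, and that attribute-level accuracy counts each attribute value separately; the abstract does not spell out these definitions. The function name `accuracies` and the toy records below are invented for illustration and are not CPRD data.

```python
from typing import Dict, List, Sequence, Tuple


def accuracies(
    gold: List[Dict[str, float]],
    predicted: List[Dict[str, float]],
    attributes: Sequence[str],
) -> Tuple[float, float]:
    """Return (prescription-level accuracy, attribute-level accuracy)."""
    correct_rx = 0      # prescriptions with every attribute correct
    correct_attr = 0    # individual attribute values that are correct
    total_attr = 0
    for g, p in zip(gold, predicted):
        matches = [g.get(a) == p.get(a) for a in attributes]
        correct_attr += sum(matches)
        total_attr += len(matches)
        correct_rx += all(matches)
    return correct_rx / len(gold), correct_attr / total_attr


# Toy example (made-up values): the second prediction gets one attribute wrong.
ATTRS = ["dose_min", "dose_max", "freq_min", "freq_max"]
gold = [
    {"dose_min": 1, "dose_max": 2, "freq_min": 3, "freq_max": 4},
    {"dose_min": 2, "dose_max": 2, "freq_min": 2, "freq_max": 2},
]
pred = [
    {"dose_min": 1, "dose_max": 2, "freq_min": 3, "freq_max": 4},
    {"dose_min": 2, "dose_max": 2, "freq_min": 2, "freq_max": 3},
]
rx_acc, attr_acc = accuracies(gold, pred, ATTRS)
```

Under these assumed definitions, one wrong attribute makes the whole prescription count as wrong, which is why a prescription-level figure is lower than the attribute-level figure.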
Our approach provides an accurate, automated way of coding prescription free-text information, including information about flexibility and variability within a prescription. The method allows researchers to decide how best to prepare prescription data for drug efficacy and safety analyses in a given setting, and to test various scenarios and their impact.