Department of Chemical Engineering, Room 66-552, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.
Biotechnol Bioeng. 1997 Mar 5;53(5):443-52. doi: 10.1002/(SICI)1097-0290(19970305)53:5<443::AID-BIT1>3.0.CO;2-H.
A large volume of data is routinely collected during the course of typical fermentation and other processes. Such data provide the required basis for process documentation and occasionally are also used for process analysis and improvement. The information density of these data is often low, and automatic condensing, analysis, and interpretation ("database mining") are highly desirable. In this article we present a methodology whereby process variables are processed to create a database of derivative process quantities representative of the global patterns, intermediate trends, and local characteristics of the process. A powerful search algorithm subsequently attempts to extract the specific process variables and their particular attributes that uniquely characterize a class of process outcomes such as high- or low-yield fermentations. The basic components of our pattern recognition methodology are described along with applications to the analysis of two sets of data from industrial fermentations. Results indicate that truly discriminating variables do exist in typical fermentation data and that they can be useful in identifying the causes or symptoms of different process outcomes. The methodology has been implemented in user-friendly software, named db-miner, which facilitates efficient and speedy analysis of fermentation process data. (c) 1997 John Wiley & Sons, Inc. Biotechnol Bioeng 53: 443-452, 1997.
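The two stages described above (deriving quantities at several time scales, then searching for attributes that separate outcome classes) can be illustrated with a minimal Python sketch. It is not the db-miner implementation: the feature names, the two-window split, the non-overlapping-range criterion, and the example runs are all assumptions made here for illustration, and the data are synthetic.

```python
from statistics import mean

def slope(values):
    """Least-squares slope of values against their sample index (needs >= 2 samples)."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den

def derive_features(series, n_windows=2):
    """Hypothetical derived quantities for one process variable."""
    # Global pattern: overall level and overall trend.
    feats = {"global_mean": mean(series), "global_slope": slope(series)}
    # Intermediate trends: slope within each of n_windows equal windows.
    size = len(series) // n_windows
    for w in range(n_windows):
        feats[f"window{w}_slope"] = slope(series[w * size:(w + 1) * size])
    # Local characteristic: largest change between consecutive samples.
    feats["max_step"] = max(abs(b - a) for a, b in zip(series, series[1:]))
    return feats

def discriminating_features(runs, labels):
    """Find (variable, feature) pairs whose value ranges do not overlap
    between the two outcome classes. Assumed criterion, chosen for simplicity."""
    by_key = {}
    for run, label in zip(runs, labels):
        for variable, series in run.items():
            for name, value in derive_features(series).items():
                by_key.setdefault((variable, name), {}).setdefault(label, []).append(value)
    found = []
    for (variable, name), groups in by_key.items():
        if len(groups) != 2:
            continue  # this sketch handles exactly two outcome classes
        a, b = groups.values()
        if max(a) < min(b):
            found.append((variable, name, (max(a) + min(b)) / 2))
        elif max(b) < min(a):
            found.append((variable, name, (max(b) + min(a)) / 2))
    return found

# Synthetic example: four runs, two variables, two outcome classes.
runs = [
    {"pH": [7.0, 6.9, 6.8, 6.7], "temp": [30.0, 30.1, 30.0, 30.1]},
    {"pH": [7.0, 6.8, 6.6, 6.4], "temp": [30.1, 30.0, 30.1, 30.0]},
    {"pH": [7.0, 7.0, 7.1, 7.1], "temp": [30.0, 30.1, 30.1, 30.0]},
    {"pH": [7.0, 7.1, 7.1, 7.2], "temp": [30.1, 30.0, 30.0, 30.1]},
]
labels = ["low", "low", "high", "high"]
found = discriminating_features(runs, labels)
for variable, name, threshold in found:
    print(f"{variable}.{name} separates the classes near {threshold:.3f}")
```

In this made-up data set the pH trend separates the low- and high-yield runs while the temperature trend does not, which is the kind of distinction the search stage is meant to surface. A real analysis would need a criterion that tolerates noise and more than two classes.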