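On the Relationships between Several Environmental-Chemistry Indices of Polycyclic Aromatic Hydrocarbons and Molecular Geometric Parameters
Chen Nianyi1, Lu Wencong1, Liu Xu1, Ye Chenzhou2, Li Guozheng2
( , Shanghai, 200436 , Shanghai, 200030)
Received: 2002-06-10; revised: 2002-09-10
Funding: jointly supported by the National Natural Science Foundation of China and Ford Motor Company (USA), grant no. 9716214
About the author: Chen Nianyi (born 1931), male, professor; research field: computer chemistry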
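Abstract The support vector machine (SVM) is a new pattern-recognition algorithm that is particularly well suited to training a model on a limited number of known samples and then predicting the properties of unknown samples. In this work, support vector regression was applied to data mining with geometric parameters of polycyclic aromatic hydrocarbon (PAH) molecules: number of rings, molecular width, length, volume, vertex connectivity index, and edge connectivity index. Mathematical models were obtained that relate the air/n-octanol partition ratio of PAHs, the soil sorption parameters of PAHs, and the bioconcentration factors of PAHs to these molecular geometric parameters. Leave-one-out cross-validation shows that the predictive reliability of these models is slightly better than that of models built with the PLS algorithm.
Keywords polycyclic aromatic hydrocarbons, environmental-chemistry indices, support vector regression, mathematical model
CLC number: O 06-04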
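The leave-one-out check mentioned in the abstract can be sketched as follows. This is only an illustration of the validation procedure: the data are synthetic, the function names are hypothetical, and a one-descriptor least-squares line stands in for the regression model (it is neither the SVR nor the PLS model of the paper). Any pair of `fit` and `predict` functions could be passed in instead.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x for a single descriptor."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def predict_line(model, x):
    """Evaluate the fitted line at x."""
    a, b = model
    return a + b * x


def leave_one_out_rmse(xs, ys, fit, predict):
    """Hold out each sample once, refit on the rest, predict the held-out one."""
    sq_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        sq_errors.append((predict(model, xs[i]) - ys[i]) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))


# Synthetic illustration only: hypothetical ring counts and property values,
# not data from the paper.
ring_counts = [2, 3, 3, 4, 4, 5, 5, 6]
property_values = [1.0, 2.1, 1.9, 3.2, 2.8, 4.1, 3.9, 5.0]

loo_rmse = leave_one_out_rmse(ring_counts, property_values, fit_line, predict_line)
print(f"leave-one-out RMSE: {loo_rmse:.3f}")
```

Comparing two modeling methods in this way amounts to calling `leave_one_out_rmse` once per method on the same samples and comparing the resulting errors.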
On the Relationships between Geometrical Parameters of Polycyclic Aromatic Hydrocarbons and Their Environmental Properties
Chen Nianyi1, Lu Wencong1, Liu Xu1, Ye Chenzhou2, Li Guozheng2
( of Chemical Data Mining, Department of Chemistry,
Shanghai University, Shanghai, 200436, China
of Image and Pattern Recognition, JiaoTong University,
Shanghai, 200030, China)
Abstract The support vector machine (SVM), proposed by Vapnik, is a recently developed technique for data mining. It is suited to data processing based on a finite number of training samples and uses a special technique to restrict overfitting. In this work, support vector regression has been used to correlate and model the relationships between the geometrical parameters and the environmental behaviors of polycyclic aromatic hydrocarbons. The predictive ability of the mathematical models obtained is somewhat better than that of models obtained by PLS regre