Co-reporter:Jiaxuan Wei, Ruisheng Zhang, Zhixuan Yu, Rongjing Hu, Jianxin Tang, Chun Gui, Yongna Yuan
Applied Soft Computing 2017 Volume 58 pp:
Publication Date(Web):1 September 2017
DOI:10.1016/j.asoc.2017.04.061
• We modify BPSO for wrapper feature selection with two mechanisms.
• The memory renewal mechanism acts on the local and global optimum and helps particles escape local extrema.
• The mutation-enhanced mechanism increases the particle mutation probability to avoid premature convergence.
• We evaluate the modified algorithm against previous versions of BPSO and other algorithms.
• The proposed algorithm achieves better accuracy with fewer features.
Feature selection (FS) is an essential component of data mining and machine learning. Finding a more effective method that achieves high accuracy with fewer features has become one of the most challenging problems in FS, and many researchers have worked on it. Some algorithms have proven effective, such as binary particle swarm optimization (BPSO), the genetic algorithm (GA) and the support vector machine (SVM). BPSO is a metaheuristic algorithm that has been applied successfully in many fields, including FS. As a wrapper method for FS, BPSO-SVM tends to converge prematurely. In this paper, we present a novel mutation-enhanced BPSO-SVM algorithm that adjusts the memory of the local and global optimum (LGO) and increases the particles’ mutation probability, in order to overcome premature convergence and obtain high-quality features. Experiments on the Sonar, LSVT and DLBCL datasets indicate that the proposed algorithm improves accuracy and reduces the number of selected features compared with existing modified BPSO algorithms and GA.
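As a rough illustration of the idea in this abstract, the sketch below shows a generic binary PSO with a bit-flip mutation step, using only the Python standard library. It is not the authors' algorithm: the memory renewal mechanism is not reproduced, the parameter values (`w`, `c1`, `c2`, `p_mut`) are assumptions, and `toy_fitness` is a hypothetical stand-in for the cross-validated SVM accuracy that a real wrapper method would compute.

```python
import math
import random


def bpso_select(fitness, n_features, n_particles=20, n_iter=50,
                w=0.7, c1=1.5, c2=1.5, p_mut=0.02, seed=0):
    """Generic binary PSO with bit-flip mutation (illustrative sketch only).

    Returns (best_mask, best_score), where best_mask is a list of 0/1 flags.
    """
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # personal best positions
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # global best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_features):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                v = max(-6.0, min(6.0, v))       # clamp the velocity
                vel[i][d] = v
                prob = 1.0 / (1.0 + math.exp(-v))  # sigmoid transfer function
                pos[i][d] = 1 if rng.random() < prob else 0
                if rng.random() < p_mut:         # extra bit-flip mutation
                    pos[i][d] = 1 - pos[i][d]
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


def toy_fitness(mask):
    """Hypothetical fitness: rewards three 'informative' features and
    penalises subset size. A real wrapper would train a classifier here."""
    informative = {0, 3, 5}
    hits = sum(1 for j in informative if mask[j])
    return hits - 0.1 * sum(mask)


best_mask, best_score = bpso_select(toy_fitness, n_features=8)
print(best_mask, best_score)
```

In a real wrapper setting the fitness call dominates the cost, since each evaluation means training and validating a classifier on the selected feature subset.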
Co-reporter:Peng Lu;Xia Wei;Yongna Yuan;Zhiguo Gong
Medicinal Chemistry Research 2011 Volume 20(Issue 8) pp:1220-1228
Publication Date(Web):2011 November
DOI:10.1007/s00044-010-9431-1
A support vector machine (SVM) was used to develop a nonlinear quantitative structure–activity relationship (QSAR) model for predicting the activities of adenosine A2A receptor antagonists. Six molecular descriptors selected by the heuristic method (HM) in CODESSA were used as inputs for the SVM. The results obtained by SVM were compared with those obtained by HM. The mean squared errors (MSEs) for the training set given by HM and SVM are 0.08 and 0.05, respectively, which shows that the SVM model performs better than the HM model.
Co-reporter:Rongjing Hu;Florent Barbault;François Maurel;Michel Delamar
Chemical Biology & Drug Design 2010 Volume 76(Issue 6) pp:518-526
Publication Date(Web):
DOI:10.1111/j.1747-0285.2010.01028.x
Molecular dynamics (MD) simulations in a water environment were carried out on the HIV-1 reverse transcriptase (RT) and on its complexes with one representative of each of three series of inhibitors: 2-amino-6-arylsulphonylbenzonitriles and their thio and sulphinyl congeners. The Molecular Mechanics Generalized Born Surface Area (MM-GBSA) method was used to calculate the binding free energy from the resulting MD trajectories. The calculated energies correlate with activity. Interaction modes, binding free energies, per-residue contributions to the binding free energy and H-bonds were compared using the average structures. The results show that RT and the ligands adopt different interaction modes, with different specific interactions with some residues. The higher binding affinity of the most potent inhibitor in the series under study is favoured by electrostatic interactions and the solvation contribution.
Co-reporter:Peng Lu;Yongna Yuan;Zhiguo Gong
Journal of Chemometrics 2010 Volume 24(Issue 9) pp:565-573
Publication Date(Web):
DOI:10.1002/cem.1314
A quantitative structure–activity relationship (QSAR) analysis was performed on the values of a series of fatty acid amide hydrolase (FAAH) inhibitors. Six molecular descriptors selected by the CODESSA software were used as inputs to the heuristic method (HM) and support vector machine (SVM) models. The results obtained by SVM were compared with those obtained by HM. The root mean square errors (RMSEs) for the training set given by HM and SVM were 0.555 and 0.404, respectively, which shows that the SVM model performs better than the HM model. This paper provides a new and effective method for predicting the activity of FAAH inhibitors. Copyright © 2010 John Wiley & Sons, Ltd.
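The model comparison above rests on the training-set error. As a minimal sketch of how such a figure is computed, the helpers below implement the standard MSE and RMSE definitions; the function names and the sample values are illustrative assumptions and are not data from the paper.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted activities."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


# Made-up observed values and two sets of hypothetical model predictions.
observed = [6.1, 7.3, 5.8, 6.9]
model_a = [6.4, 6.8, 6.3, 6.5]
model_b = [6.2, 7.0, 6.0, 6.8]
print(rmse(observed, model_a), rmse(observed, model_b))
```

A lower training-set RMSE alone does not establish better predictive ability, so error on a held-out test set is usually reported alongside it.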
Co-reporter:Rongjing Hu, Jean-Pierre Doucet, Michel Delamar, Ruisheng Zhang
European Journal of Medicinal Chemistry 2009 Volume 44(Issue 5) pp:2158-2171
Publication Date(Web):May 2009
DOI:10.1016/j.ejmech.2008.10.021
A quantitative structure–activity relationship study of a series of HIV-1 reverse transcriptase inhibitors (2-amino-6-arylsulfonylbenzonitriles and their thio and sulfinyl congeners) was performed. Topological and geometrical descriptors, as well as quantum mechanical energy-related and charge distribution-related descriptors, all generated by CODESSA, were selected to describe the molecules. Principal component analysis (PCA) was used to select the training set. Six techniques, namely multiple linear regression (MLR), multivariate adaptive regression splines (MARS), radial basis function neural networks (RBFNN), general regression neural networks (GRNN), projection pursuit regression (PPR) and support vector machine (SVM), were used to establish QSAR models for two data sets: anti-HIV-1 activity and HIV-1 reverse transcriptase binding affinity. The results showed that the PPR and SVM models had strong predictive capacity.
A QSAR study of a series of HIV-1 NNRTIs was performed based on six different methods: MLR, MARS, RBFNN, GRNN, PPR and SVM. PPR and SVM yielded the best models.
Co-reporter:Yongna Yuan, Ruisheng Zhang, Rongjing Hu, Xiaofang Ruan
European Journal of Medicinal Chemistry 2009 Volume 44(Issue 1) pp:25-34
Publication Date(Web):January 2009
DOI:10.1016/j.ejmech.2008.03.004
Quantitative structure–activity relationship (QSAR) models were developed to predict the CCR5 binding affinity of substituted 1-(3,3-diphenylpropyl)-piperidinyl amides and ureas using the linear free energy relationship (LFER). Eight molecular descriptors selected by the heuristic method (HM) in CODESSA were used as inputs to perform multiple linear regression (MLR), support vector machine (SVM) and projection pursuit regression (PPR) studies. Compared with the MLR model, the SVM and PPR models give better results, with a predicted correlation coefficient (R²) of 0.867 and 0.834 and a squared standard error (s²) of 0.095 and 0.119 for the training set, and R² of 0.732 and 0.726 and s² of 0.210 and 0.207 for the test set, respectively. This indicates that the SVM and PPR approaches are better suited to the set of molecules studied. In addition, the methods used in this paper are simple, practical and effective for chemists to predict binding affinity for the human CCR5 chemokine receptor.
Definition of substituted 1-(3,3-diphenylpropyl)-piperidinyl amides and ureas. R1-position (showing limited substitution pattern), X position (showing limited structural variations), and the R2, R3, R4 positions of the phenyl rings (showing diverse substitution pattern).
Co-reporter:Rongjing Hu, Florent Barbault, Michel Delamar, Ruisheng Zhang
Bioorganic & Medicinal Chemistry 2009 Volume 17(Issue 6) pp:2400-2409
Publication Date(Web):15 March 2009
DOI:10.1016/j.bmc.2009.02.003
Molecular modeling of a series of HIV reverse transcriptase (RT) non-nucleoside inhibitors (2-amino-6-arylsulfonylbenzonitriles and their thio and sulfinyl congeners) was carried out by comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) approaches. Docking simulations were employed to position the inhibitors in the RT active site to determine the most probable binding mode and the most reliable conformations. The study was conducted using a complex receptor-based and ligand-based alignment procedure, and different alignment modes were studied to obtain highly reliable and predictive CoMFA and CoMSIA models with cross-validated q² values of 0.723 and 0.760, respectively. Furthermore, the CoMFA and CoMSIA contour maps, with the 3D structure of the target (the binding site of RT) inlaid, were obtained to better understand the interaction between the RT protein and the inhibitors and the structural requirements for inhibitory activity against HIV-1. We show that for 2-amino-6-arylsulfonylbenzonitrile inhibitors to have appreciable inhibitory activity, bulky and hydrophobic groups in the 3- and 5-positions of the B ring are required. Moreover, H-bond donor groups in the 2-position of the A ring, which can form H-bonds with the Lys101 residue of the RT protein, are also favorable to activity.
A CoMSIA model with high predictive ability is developed. Favorable positions for bulky, hydrophobic and H-bond donor or acceptor substituents are discussed.
Co-reporter:Yongna Yuan, Ruisheng Zhang, Liangying Luo
Chemometrics and Intelligent Laboratory Systems 2009 Volume 96(Issue 2) pp:144-148
Publication Date(Web):15 April 2009
DOI:10.1016/j.chemolab.2009.01.004
The least-squares support vector machine (LS-SVM), an effective machine learning algorithm, was used to develop a nonlinear binary classification model of novel piperazines-bis-piperazines as antagonists for the melanocortin-4 (MC4) receptor based on their activity. Each compound was represented by calculated structural descriptors that encode constitutional, topological, geometrical, electrostatic and quantum-chemical features. Five descriptors selected by forward stepwise linear discriminant analysis (LDA) were used as inputs to the LS-SVM model. The nonlinear model developed with the LS-SVM algorithm (prediction accuracy of 95% on the test set) outperformed LDA (test accuracy of 90%). The proposed method is useful for chemists screening antagonists for the MC4 receptor.