This blog is for cheminformaticians and chemogenomics enthusiasts.
Friday, January 27, 2012
My Facebook network
Some important properties:
Average weighted degree: 22.44
Network diameter: 10
Average path length: 3.87
Average clustering coefficient: 0.553
Modularity: 0.697
Degree Report
Results:
Average Degree: 22.444
Graph Distance Report
Parameters:
Network Interpretation: undirected
Results:
Diameter: 10
Radius: 0
Average Path length: 3.8770262057454894
Number of shortest paths: 220486
Clustering Coefficient Metric Report
Parameters:
Network Interpretation: undirected
Results:
Average Clustering Coefficient: 0.553
Total triangles: 39886
The Average Clustering Coefficient is the mean value of individual coefficients.
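To make the clustering numbers in the report a bit more concrete, here is a minimal pure-Python sketch of how an average clustering coefficient and a triangle count can be computed for an undirected graph. The adjacency dictionary and the function names are my own illustration, not Gephi's code, and I am assuming that nodes with fewer than two neighbours contribute a coefficient of 0 to the mean.

```python
from itertools import combinations


def closed_pairs(adj, node):
    """Count pairs of neighbours of `node` that are themselves connected."""
    return sum(1 for u, v in combinations(sorted(adj[node]), 2) if v in adj[u])


def local_clustering(adj, node):
    """Fraction of neighbour pairs of `node` that are linked to each other."""
    k = len(adj[node])
    if k < 2:
        return 0.0  # assumption: low-degree nodes count as 0
    return 2.0 * closed_pairs(adj, node) / (k * (k - 1))


def average_clustering(adj):
    """Mean of the individual (local) clustering coefficients."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


def total_triangles(adj):
    """Each triangle is seen once from each of its three corners."""
    return sum(closed_pairs(adj, n) for n in adj) // 3


# Toy friendship graph: a triangle a-b-c plus one extra friend d of c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(average_clustering(toy), total_triangles(toy))
```

On a real friend network the same quantities come straight out of Gephi; the sketch only shows what the reported numbers mean.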
Thursday, January 26, 2012
Tuesday, January 24, 2012
An introductory tutorial on qplot, which requires the ggplot2 package
qplot tutorial
Monday, January 23, 2012
Here is the electricity data for 140 countries. It also contains the cost and the procedures involved for those countries. After studying it, I found that Africa and some Middle East countries are the ones the UN should look at.
In today's modern world we don't want people to be getting an electricity connection a year after they applied.
pknB Inhibitors for MTB
To start with, I used various screening strategies (pharmacophore screening, vROCS and the FTrees algorithm) to screen kinase datasets obtained from the Otava chemical library, Kindock, our own internal dataset (Chem2Bio2RDF) and ZINCPharmer hits. I used OMEGA2 to generate conformers of the compounds and then used the screening software to screen the datasets. I then ranked the compounds by the harmonic mean of their ranks across the screening sets. After that, I shortlisted some compounds using Pipeline Pilot's various ADMET filters.
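The harmonic-mean ranking step can be sketched roughly as follows. This is only an illustration under my own assumptions: the function name, the toy compound IDs and especially the treatment of compounds that a method did not rank (here they get a penalty rank one past the end of that method's list) are hypothetical and not necessarily what was done in the actual screen.

```python
from statistics import harmonic_mean


def harmonic_rank_fusion(rankings):
    """Fuse several ranked hit lists into one consensus ordering.

    rankings: dict mapping a method name to a list of compound IDs,
              ordered from best (rank 1) to worst.
    Returns the compound IDs sorted by the harmonic mean of their ranks.
    """
    compounds = {c for ranked in rankings.values() for c in ranked}
    fused = {}
    for compound in compounds:
        ranks = []
        for ranked in rankings.values():
            if compound in ranked:
                ranks.append(ranked.index(compound) + 1)
            else:
                # assumption: unranked compounds get a penalty rank
                ranks.append(len(ranked) + 1)
        fused[compound] = harmonic_mean(ranks)
    # lower fused rank is better; ties are broken by compound ID
    return sorted(fused, key=lambda c: (fused[c], c))


# Hypothetical hit lists from the three screening methods.
rankings = {
    "pharmacophore": ["A", "B", "C"],
    "vrocs": ["B", "A", "D"],
    "ftrees": ["B", "C", "A"],
}
print(harmonic_rank_fusion(rankings))  # ['B', 'A', 'C', 'D']
```

The harmonic mean rewards a compound that is ranked very highly by at least one method, which is why it is a common choice for this kind of data fusion.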
There are hits which were ranked highly by the pharmacophore screen but did not receive any rank from the other methods.
The other methods also had some compounds which were not ranked by the remaining methods; they are given below.
Compounds were also predicted for biological activity using PASS prediction. The figure below shows the PASS map of the biological activity with the targets.
Structural diversity of biologically interesting datasets: a scaffold analysis approach
Journal of Cheminformatics 2011, 3:30 doi: 10.1186/1758-2946-3-30
Varun Khanna and Shoba Ranganathan
Review- Abhik Seal
Introduction and methods
This paper describes the role of metabolites and natural products (NPs) in drug design and in the design of lead compound libraries. Its premise is that natural products and metabolites are each recognized by at least one protein in the biosphere. Since metabolites and natural products have been optimized by nature to bind biological targets, lead libraries designed with scaffolds and fragments of NPs and metabolites are likely to result in molecules with better ADMET properties.
For the study, several datasets were considered: drugs (taken from DrugBank and KEGG DRUG), metabolites (HMDB, HumanCyc, BiGG), toxics (DSSTox, FDA Carcinogenicity, ITER, Super Toxicity), natural products (ZINC NP database), leads (BIONET, Maybridge), NCI and ChEMBL. Duplicate entries, organic ions and metal ions were removed from the compounds, along with corrupted or missing structures. After filtering, the data was clustered with the Pipeline Pilot "Clara" program using ECFP_4 or FCFP_4 fingerprints. Physicochemical analysis was done using the clustering with respect to the Lipinski properties (molecular weight, number of hydrogen bond acceptors, AlogP as a hydrophobicity measure, and number of hydrogen bond donors) and other descriptors such as molecular polar surface area, molecular solubility, number of rings and number of rotatable bonds. A scaffold analysis was also performed and its results analysed.
Similarity Analysis
In this paper a fragment-based approach is taken, in which compounds are broken down into low molecular weight, drug-like fragments such as ring systems, functional groups, side chains, linkers and fingerprints.
The diversity analysis found that the ChEMBL dataset generated more fragments than any of the others and appears to be the most diverse, whereas the metabolites produced the fewest fragments, which means that the metabolite compounds are not very diverse and occupy a limited region of chemical space. The other datasets were found to be moderately diverse. A Tanimoto analysis was also done on the datasets using the count-based form of the coefficient given in the equation below:
$$T(A,B) = \frac{\sum_{i=1}^{n} x_{iA}\, x_{iB}}{\sum_{i=1}^{n} x_{iA}^{2} + \sum_{i=1}^{n} x_{iB}^{2} - \sum_{i=1}^{n} x_{iA}\, x_{iB}}$$
Here $x_{iA}$ and $x_{iB}$ are the number of times the $i$th fragment occurs in A and B, summed over the $n$ elements of each fingerprint.
FCFP fingerprints were generated and the Tanimoto measure was calculated between the various datasets. Drugs and toxic substances showed a similarity of 0.91, and the drugs were least similar to the metabolites. The fragments found in the metabolites were least similar to those of the natural products.
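As a rough illustration of a Tanimoto measure built on fragment counts rather than on/off bits, here is a small sketch. I am assuming the standard count-vector form, T = sum(a*b) / (sum(a^2) + sum(b^2) - sum(a*b)); the function name and the fragment labels are made up, and real FCFP fingerprints would come from a cheminformatics toolkit rather than hand-written dictionaries.

```python
from collections import Counter


def count_tanimoto(frags_a, frags_b):
    """Tanimoto similarity between two fragment-count fingerprints.

    Each argument is either a dict of fragment -> count or a plain
    list of fragments (which is then counted).
    """
    a, b = Counter(frags_a), Counter(frags_b)
    dot = sum(a[f] * b[f] for f in a.keys() & b.keys())
    norm_a = sum(v * v for v in a.values())
    norm_b = sum(v * v for v in b.values())
    denom = norm_a + norm_b - dot
    # assumption: two empty fingerprints are treated as similarity 0
    return dot / denom if denom else 0.0


# Hypothetical fragment counts for two small datasets.
set_a = {"benzene": 2, "amide": 1}
set_b = {"benzene": 1, "amide": 1, "pyridine": 1}
print(count_tanimoto(set_a, set_b))  # 0.6
```

Identical fingerprints give 1.0 and fingerprints with no fragment in common give 0.0, so a value such as 0.91 between two datasets means their fragment profiles are nearly the same.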
Physicochemical properties Analysis
Lipinski's rule of five (Ro5) predicts a drug's oral bioavailability. In the clustered sets, 25% of the drugs do not comply with the Ro5 and 68% of the metabolites lie outside the rules, but after removal of the lipids the metabolite figure drops to 20%. Around 26% of the toxic compounds fail the Ro5, compared with only 16% of the natural products and 19% of the lead molecules. The metabolites also showed higher solubility, higher molecular surface area and lower molecular complexity than the drugs.
Scaffold Analysis
The scaffold analysis shows that the drugs have the highest proportion of scaffolds (50%), followed by the toxics (42%), with the metabolites lowest (14%). High values indicate the diversity of the compounds in chemical space. More than 70% of the scaffolds in the ChEMBL and NCI datasets were singletons. In the natural products, metabolites and leads datasets, 64%, 39% and 34% of the scaffolds respectively are recurring, meaning that the compounds are concentrated in a certain region. A search for aromatic rings indicated that 85% of the drugs have an aromatic ring as scaffold, and 97.4% of the lead compounds do. Among the top five scaffolds analysed, benzene is the most abundant in all of the datasets, followed by pyridine, steroids, purines and imidazoles.
Of the 296 non-redundant scaffolds found in the metabolites, 42% are shared with the drugs and 23% with the leads, which indicates optimization of structures to become more metabolite-like. A large part of the metabolite scaffolds is also present in the natural products (around 47%), NCI (78%) and ChEMBL (73%).
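The scaffold percentages above can be reproduced in spirit with a few lines of Python once each molecule has been reduced to a scaffold string. This sketch assumes the scaffolds are already available (for example as SMILES of the ring frameworks); the function names, the toy data and my reading of "singleton" as a scaffold that occurs in exactly one molecule are assumptions, not definitions taken from the paper.

```python
from collections import Counter


def scaffold_stats(scaffolds):
    """Summarise scaffold diversity for one dataset.

    scaffolds: list with one scaffold string per molecule.
    """
    counts = Counter(scaffolds)
    n_unique = len(counts)
    singletons = sum(1 for c in counts.values() if c == 1)
    return {
        # unique scaffolds relative to the number of molecules
        "scaffolds_per_molecule": n_unique / len(scaffolds),
        # share of scaffolds seen in exactly one molecule
        "singleton_fraction": singletons / n_unique,
    }


def shared_fraction(reference, other):
    """Fraction of the unique scaffolds in `reference` also found in `other`."""
    ref = set(reference)
    return len(ref & set(other)) / len(ref)


# Hypothetical scaffold lists (benzene, pyridine, cyclohexane, naphthalene).
metabolites = ["c1ccccc1", "c1ccccc1", "c1ccncc1", "C1CCCCC1"]
drugs = ["c1ccccc1", "c1ccncc1", "c1ccc2ccccc2c1"]

print(scaffold_stats(metabolites))
print(shared_fraction(metabolites, drugs))
```

With the real datasets, `shared_fraction(metabolites, drugs)` would be the analogue of the 42% figure quoted above.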
With the above analysis in mind, it is possible to suggest that natural compounds and metabolites are important molecules in drug discovery, as most biological targets use one of these compounds. The scaffolds of the NPs and metabolites are important in designing lead libraries.