Tuesday, October 22, 2013

Fast Tanimoto Similarity Calculation using rcdk

Many of you who use R for cheminformatics probably know rcdk. I have found that its Tanimoto similarity calculation takes a long time. The rcdk code looks neat, but the similarity matrix can be computed much faster using inner products. A few lines of matrix code are enough to do that, and in the timings below it runs more than 10 times faster than the rcdk code (about 3 seconds versus about 44 seconds elapsed). Quite an impressive performance boost. I have made a pull request to Rajarshi's code, so it should be available soon in the main package.

## Consider m to be the binary matrix of 0s and 1s (one fingerprint per row), e.g. the one you get from fp.to.matrix ##
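
For readers who want to try the idea, here is a minimal sketch of the inner-product approach in base R. It is an illustration written for this post, not the code from the pull request: the function name tanimoto_matrix and the small demo matrix are hypothetical. It relies on the fact that for two binary fingerprints the Tanimoto coefficient is c / (a + b - c), where c is the number of bits set in both and a and b are the bit counts of each. One matrix product gives c for every pair at once.

```r
# Hypothetical helper (not part of rcdk or fingerprint): all-pairs Tanimoto
# similarity for a 0/1 matrix m with one fingerprint per row.
tanimoto_matrix <- function(m) {
  m <- as.matrix(m)
  common <- tcrossprod(m)            # common[i, j] = bits set in both i and j
  bits <- rowSums(m)                 # number of bits set in each fingerprint
  either <- outer(bits, bits, "+") - common  # bits set in i or j (union)
  sim <- common / either
  # Two all-zero fingerprints give 0/0; this sketch chooses to report 0 there.
  sim[either == 0] <- 0
  sim
}

# Tiny made-up example: rows 1 and 3 are identical, row 2 shares one bit with them.
m_demo <- rbind(c(1, 1, 0, 0),
                c(1, 0, 1, 0),
                c(1, 1, 0, 0))
sim <- tanimoto_matrix(m_demo)
print(sim)
```

The speed comes from replacing a loop over pairs of fingerprints with a single call to tcrossprod, which runs in compiled linear algebra code. The cost is memory: the full n by n matrix is built in one go.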

#Time taken for the new method

   user  system elapsed 
  2.962   0.012   2.971 

#Normal method in rcdk

   user  system elapsed 
 43.644   0.064  43.707 
