In this work, we calculated the pairwise chemical similarity for a subset of small molecules screened against the NCI60 cancer cell line panel. Four compound similarity calculation methods were used: Brutus, GRIND, Daylight and UNITY. The chemical similarity scores from each method, and from combinations of methods, were related to the biological similarity data set. This yielded an estimate of biological similarity for a given chemical similarity score or combination of scores. These estimates were used to identify chemical similarity ranges in which combining two or more methods (data fusion) led to synergy. The results were also applied to ligand-based virtual screening using the DUD data set. With respect to their ability to enrich biologically similar compound pairs, the four methods ranked, in descending order of performance, UNITY, Daylight, Brutus and GRIND. Combining methods always resulted in positive synergy within a restricted range of chemical similarity scores; we observed no negative synergy. Combining three or four methods offered only a limited advantage over combining just two. In virtual screening, ranking compounds by the estimated biological similarity produced more consistent results than using the methods in isolation.
- Ligand-based virtual screening
- Data fusion
- Chemical similarity
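
The idea of estimating biological similarity from chemical similarity scores, alone or fused, can be illustrated with a minimal sketch. It is not the procedure used in this work: the equal-width binning, the definition of synergy as the fused estimate minus the better single-method estimate, the function names, the method labels "A" and "B", and the toy values are all assumptions made for illustration.

```python
from collections import defaultdict


def bin_index(score, n_bins):
    """Map a similarity score in [0, 1] to an equal-width bin (assumed scheme)."""
    return min(int(score * n_bins), n_bins - 1)


def estimate_bio_similarity(pairs, methods, n_bins):
    """Mean biological similarity of compound pairs per bin of chemical similarity.

    With one method the keys are 1-tuples of bin indices; with several
    methods they are tuples with one bin index per method.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pair in pairs:
        key = tuple(bin_index(pair[m], n_bins) for m in methods)
        sums[key] += pair["bio"]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def synergy(pairs, method_a, method_b, n_bins):
    """Fused estimate minus the better single-method estimate (assumed definition)."""
    single_a = estimate_bio_similarity(pairs, [method_a], n_bins)
    single_b = estimate_bio_similarity(pairs, [method_b], n_bins)
    fused = estimate_bio_similarity(pairs, [method_a, method_b], n_bins)
    return {
        (i, j): est - max(single_a[(i,)], single_b[(j,)])
        for (i, j), est in fused.items()
    }


# Hypothetical toy data: chemical similarity from two placeholder methods
# "A" and "B", plus a biological similarity value for each compound pair.
toy_pairs = [
    {"A": 0.9, "B": 0.9, "bio": 0.8},
    {"A": 0.9, "B": 0.1, "bio": 0.2},
    {"A": 0.1, "B": 0.9, "bio": 0.2},
    {"A": 0.1, "B": 0.1, "bio": 0.0},
]

single = estimate_bio_similarity(toy_pairs, ["A"], n_bins=2)
result = synergy(toy_pairs, "A", "B", n_bins=2)
print(single)
print(result)
```

In this toy data, pairs scored as similar by both placeholder methods have a higher mean biological similarity than pairs scored as similar by either method alone, which is what the sketch reports as positive synergy for the top bin. With real data, the bin width and the minimum number of pairs per bin would need to be chosen with care.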