ActSeek: Fast and accurate search algorithm of active sites in Alphafold database

Sandra Castillo*, O H Samuli Ollila

*Corresponding author for this work

Research output: Contribution to journal, Article, Scientific, peer-review

Abstract

Motivation
Finding proteins with specific functions by mining modern databases could lead to substantial advances in a wide range of fields, from medicine and biotechnology to materials science. Currently available algorithms mine proteins based on their sequence or structure. However, the activities of many proteins, such as enzymes and drug targets, are dictated by active site residues and their surroundings rather than by the overall structure or sequence of the protein.

Results
We introduce ActSeek, a fast, computer vision-inspired program that searches structural databases for proteins whose active sites are similar to those of a seed protein. ActSeek is implemented to mine proteins with desired active site environments from the Alphafold database. Its potential to help address pressing global challenges is demonstrated by identifying enzymes that may be used to produce biodegradable plastics or to degrade plastics, as well as potential off-targets for common drug molecules.
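
To make the idea of comparing active sites by local geometry concrete, the sketch below scores two sets of matched residue coordinates with the distance RMSD (dRMSD), a generic measure that is unchanged by rotating or translating either protein. It is an illustration only and is not ActSeek's algorithm, which this abstract does not describe; the function names and the coordinates are hypothetical, and the residues are assumed to be already paired in the same order.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return distances between every pair of 3D points, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def drmsd(site_a, site_b):
    """Distance RMSD between two equally sized, correspondingly ordered sites.

    Compares internal distances only, so no superposition is needed.
    """
    if len(site_a) != len(site_b):
        raise ValueError("sites must contain the same number of residues")
    if len(site_a) < 2:
        raise ValueError("at least two residues are needed")
    dist_a = pairwise_distances(site_a)
    dist_b = pairwise_distances(site_b)
    squared = sum((x - y) ** 2 for x, y in zip(dist_a, dist_b))
    return math.sqrt(squared / len(dist_a))


# Hypothetical coordinates in angstroms for three active site residues.
seed_site = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (0.0, 5.0, 0.0)]
# The same triangle rotated 90 degrees about z and shifted by (10, 10, 10).
candidate_site = [(10.0, 10.0, 10.0), (10.0, 13.8, 10.0), (5.0, 10.0, 10.0)]
# A site with one residue displaced by 1 angstrom along x.
stretched_site = [(0.0, 0.0, 0.0), (4.8, 0.0, 0.0), (0.0, 5.0, 0.0)]

same_geometry_score = drmsd(seed_site, candidate_site)
different_geometry_score = drmsd(seed_site, stretched_site)
```

Here `same_geometry_score` is zero up to floating-point rounding, while `different_geometry_score` is clearly larger. Because dRMSD uses only internal distances, it cannot tell a site from its mirror image; a superposition-based score would be needed for that.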

Availability and implementation
The ActSeek source code is available at https://github.com/vttresearch/ActSeek under a Non-Commercial License Agreement.
Original language: English
Article number: btaf424
Journal: Bioinformatics
Volume: 41
Issue number: 8
DOIs
Publication status: Published - Aug 2025
MoE publication type: A1 Journal article-refereed

Funding

This work was financially supported by the Jane and Aatos Erkko Foundation (JAES) under project 220048 (Virtual laboratory for Biodesign, JAESBIODESIGN) and by the Research Council of Finland (358505 and 356568).

Keywords

  • Algorithms
  • Databases, Protein
  • Catalytic Domain
  • Software
  • Proteins/chemistry
  • Data Mining/methods
  • Computational Biology/methods
