Abstract
Type IV secretion systems (T4SSs) are employed by pathogenic bacteria to inject proteins known as Type IV secreted effectors (T4SEs) into both prokaryotic and eukaryotic cells. These effectors play a crucial role in bacterial virulence by disrupting host cell functions and immune responses. While extensive research has focused on classifying T4SEs, the application of unsupervised learning techniques in this domain remains unexplored. In this study, we applied six unsupervised machine learning algorithms to a dataset of T4SEs and non-effectors to identify distinct clusters. Our findings suggest that unsupervised learning holds potential for gaining a deeper understanding of T4SS mechanisms and the diverse properties of T4SEs. Among the clustering algorithms used in this study, the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm generally performed best in terms of the average silhouette score and the Davies-Bouldin Index (DBI). The Laplacian score feature selection algorithm and Principal Component Analysis (PCA) had a positive effect on performance when used in conjunction with amino acid composition (AAC) as the feature extraction method. Additionally, glycine (G) and lysine (K) were found to be informative amino acids for the clustering algorithms that produce two clusters.
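As an illustration of the amino acid composition (AAC) features mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes the common definition of AAC as the relative frequency of each of the 20 standard amino acids in a sequence. The function name `amino_acid_composition` and the toy sequence are hypothetical and are not taken from the study's dataset.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes) in a fixed order,
# so every sequence maps to a feature vector of the same length.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the fraction of each standard amino acid in `sequence`.

    Assumption: non-standard symbols (e.g. X, B, Z) are ignored and the
    fractions are normalized over the standard residues only.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, used only to show the shape of the output.
features = amino_acid_composition("MKKGGKLA")
g_fraction = features[AMINO_ACIDS.index("G")]  # glycine share
k_fraction = features[AMINO_ACIDS.index("K")]  # lysine share
```

Each protein becomes a 20-dimensional vector whose entries sum to 1. Vectors of this kind could then be passed to a dimensionality reduction step such as PCA and to a clustering algorithm such as DBSCAN, with the glycine and lysine entries being the ones the abstract singles out as informative.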
Original language | English |
---|---|
Title of host publication | 2024 4th International Conference on Applied Artificial Intelligence (ICAPAI) |
Publisher | IEEE (Institute of Electrical and Electronics Engineers) |
Pages | 1-6 |
Number of pages | 6 |
ISBN (Electronic) | 979-8-3503-4976-4 |
ISBN (Print) | 979-8-3503-4977-1 |
Publication status | Published - 16 Apr 2024 |
MoE publication type | A4 Article in a conference publication |
Event | 4th International Conference on Applied Artificial Intelligence, ICAPAI 2024 - Halden, Norway. Duration: 16 Apr 2024 → 16 Apr 2024 |
Conference
Conference | 4th International Conference on Applied Artificial Intelligence, ICAPAI 2024 |
---|---|
Country/Territory | Norway |
City | Halden |
Period | 16/04/24 → 16/04/24 |
Keywords
- Proteins
- Measurement
- Machine learning algorithms
- Noise
- Clustering algorithms
- Feature extraction
- Amino acids