Effector proteins, or effectors, play a crucial role in microbial infection of plants, animals, and humans. Pathogens secrete these proteins across microbial and host cell membranes, where they interact with host proteins and alter the functioning of the host system to promote entry, survival, and replication of the pathogen inside the host cell. Identifying effectors from a set of microbial genes is one of the first key steps in studying host-pathogen interactions. Here, we propose an accurate computational approach to predict bacterial effectors across different secretion systems by integrating protein sequences, structures, and genomic information. The approach employs a statistical framework to integrate independent feature-based machine learning classifiers trained on a set of experimentally determined microbial effectors. Each classifier is designed to test an independent hypothesis about the presence of an effector-specific signal in the protein sequence. Specifically, we test whether effector proteins harbor a signal (i) in the N-terminal region, (ii) in the C-terminal region, or (iii) anywhere in the protein sequence. For each classifier, we analyzed various features of effector proteins and trained a Support Vector Machine (SVM) to predict whether a protein is an effector or a non-effector.
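The three hypotheses above can be illustrated with a minimal sketch of region-specific feature extraction. This is not the authors' feature set: the amino acid composition feature, the function names `composition` and `region_features`, and the 30-residue window are all assumptions chosen for illustration. Each of the three resulting vectors could serve as input to a separate classifier.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature vectors align.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(segment):
    """Return the fraction of each standard amino acid in a segment."""
    counts = Counter(segment)
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        # Empty segment or only non-standard symbols: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def region_features(sequence, window=30):
    """Build one feature vector per hypothesis: N-terminal, C-terminal, full.

    `window` (hypothetical default of 30 residues) sets how many residues
    count as a terminal region; it is an assumption, not a published value.
    """
    if window <= 0:
        raise ValueError("window must be a positive number of residues")
    seq = sequence.upper()
    return {
        "n_terminal": composition(seq[:window]),   # first `window` residues
        "c_terminal": composition(seq[-window:]),  # last `window` residues
        "full": composition(seq),                  # whole sequence
    }
```

Keeping the three regions as separate vectors mirrors the idea of one classifier per hypothesis; a sequence shorter than the window simply contributes all of its residues to both terminal vectors.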

The individual SVMs predicted effectors with an accuracy of 97% to 98% and a precision of 97% to 98%; combining them yielded an accuracy of 97% and a precision of 98%. We demonstrated the whole-genome scale applicability of our method by applying it to five bacterial genomes. The bacteria considered (Legionella pneumophila, Acinetobacter baumannii, Helicobacter pylori, Mycobacterium ulcerans, Chlamydia trachomatis, and Mycobacterium tuberculosis) cause diseases that are among the deadliest tropical diseases in third-world countries as well as in the US. Computational prediction of secreted effectors from protein sequences represents an important step towards better understanding the interactions between pathogens and hosts.
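For readers unfamiliar with the two reported metrics, the sketch below shows how accuracy and precision are conventionally computed from binary predictions. The function name `accuracy_and_precision` and the label encoding (1 for effector, 0 for non-effector) are assumptions for illustration; the toy labels in the usage lines are invented and are not data from this study.

```python
def accuracy_and_precision(true_labels, predicted_labels):
    """Compute accuracy and precision for binary labels (1 = effector)."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one labeled example is required")

    tp = fp = tn = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if predicted == 1 and truth == 1:
            tp += 1  # effector correctly predicted
        elif predicted == 1 and truth == 0:
            fp += 1  # non-effector wrongly called an effector
        elif predicted == 0 and truth == 0:
            tn += 1  # non-effector correctly predicted
        else:
            fn += 1  # effector missed

    # Accuracy: fraction of all predictions that are correct.
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Precision: fraction of predicted effectors that are true effectors.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    return accuracy, precision


# Toy usage with invented labels (not results from the paper).
toy_truth = [1, 1, 1, 0, 0, 0, 0, 1]
toy_predicted = [1, 1, 0, 0, 0, 1, 0, 1]
toy_accuracy, toy_precision = accuracy_and_precision(toy_truth, toy_predicted)
```

Precision matters here because each predicted effector is a candidate for costly experimental validation, so a low false-positive rate is as important as overall accuracy.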

 

 
