The identification of T cell epitopes from immunologically relevant antigens of
Mycobacterium tuberculosis is a critical step in the development of a vaccine covering diverse populations. Two multigene families, PE-PGRS and PPE, make up about 10% of the
M. tuberculosis genome. However, the functions of the proteins encoded by this large number of genes are unknown. All possible nonameric peptide sequences from the PE and PPE proteins were analyzed in silico for their ability to bind to 33 alleles of class I HLA. The results reveal that, across all PE and PPE proteins, a significant number of these peptides are predicted to be high-affinity HLA binders, irrespective of the length of the protein. Pathogen peptides that could behave as self or partially self peptides in the host were eliminated by comparison with the human proteome, thus reducing the number of peptides for analysis. The structural basis for recognition of the predicted nonamers by their respective HLA molecules was analyzed by molecular modeling. The structural analysis correlated well with the binding predictions and also clarified the binding profile of the peptides with respect to different alleles of class I HLA. The predicted epitopes can be tested experimentally for inclusion in a potential HLA haplotype-specific vaccine against tuberculosis.
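
The peptide enumeration and self-peptide elimination steps described above can be illustrated with a minimal sketch. It assumes a plain sliding window over each protein sequence and an exact-match comparison against human nonamers. The function names `nonamers` and `remove_self_peptides` and the toy sequences are hypothetical, the criteria used in the study for "partially self" peptides are not reproduced here, and HLA binding prediction itself is not shown.

```python
def nonamers(sequence, k=9):
    """Return every overlapping k-mer of a protein sequence, in order."""
    sequence = sequence.upper()
    # A sequence shorter than k yields an empty list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def remove_self_peptides(peptides, human_proteins):
    """Drop peptides that occur verbatim in any human protein.

    Assumption: only exact matches count as self; partial-identity
    rules would need an additional similarity criterion.
    """
    human_kmers = set()
    for protein in human_proteins:
        human_kmers.update(nonamers(protein))
    return [p for p in peptides if p not in human_kmers]


# Toy sequences invented for illustration; not real PE/PPE or human proteins.
pathogen = "MSFVVTIPEALAAV"
human = ["AAAVTIPEALAAKK"]

candidates = nonamers(pathogen)
filtered = remove_self_peptides(candidates, human)
print(len(candidates), "candidate nonamers,", len(filtered), "after self filtering")
```

Storing the human nonamers in a set keeps each lookup constant-time, which matters when the comparison runs against a whole proteome rather than a toy list.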