The human genome is predicted to contain ~20,500 protein-coding genes. The encoded proteins are the key players in the body, but the functions and localizations of most of them are still unknown. Antibody-based proteomics has great potential for exploring the protein complement of the human genome, but antibodies are available for only a very limited set of proteins. The Human Proteome Resource (HPR) project was launched in August 2003 with the aim of generating high-quality, specific antibodies towards the human proteome and using these antibodies for large-scale protein profiling in human tissues and cells.
The goal of the work presented in this thesis was to evaluate whether antigens can be selected, in a high-throughput manner, to enable generation of specific antibodies towards one protein from every human gene. A computationally intensive analysis of potential epitopes in the human proteome showed that it should be possible to find unique epitopes for most human proteins. The results of this analysis were implemented in a new web-based visualization tool for antigen selection, which also displays predicted protein features important for antigen selection, such as transmembrane regions and signal peptides. The antigens used in HPR are named protein epitope signature tags (PrESTs). A genome-wide analysis combining different protein features revealed that it should be possible to select unique PrESTs, 50 amino acids long, for ~80% of the human protein-coding genes.
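The idea of scanning a protein for regions that are unique within the proteome can be illustrated with a minimal sketch. It is not the analysis performed in the thesis: it assumes that exact sharing of short k-mers is an acceptable stand-in for sequence similarity, and all function names, the toy sequences, and the default values of `window` and `k` are hypothetical choices made for illustration.

```python
from collections import defaultdict


def kmer_index(proteome, k):
    """Map every k-mer to the set of protein ids that contain it."""
    index = defaultdict(set)
    for pid, seq in proteome.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(pid)
    return index


def window_scores(pid, proteome, window, k):
    """Score each window of the target protein.

    The score is the number of k-mers inside the window that also occur
    in at least one other protein (0 means no shared k-mer was found).
    """
    index = kmer_index(proteome, k)
    seq = proteome[pid]
    # shared[i] is 1 if the k-mer starting at i occurs in another protein.
    shared = [
        1 if index[seq[i:i + k]] - {pid} else 0
        for i in range(len(seq) - k + 1)
    ]
    kmers_per_window = window - k + 1
    return [
        sum(shared[start:start + kmers_per_window])
        for start in range(len(seq) - window + 1)
    ]


def unique_windows(pid, proteome, window=50, k=8):
    """Return start positions of windows that share no k-mer with other proteins."""
    scores = window_scores(pid, proteome, window, k)
    return [start for start, score in enumerate(scores) if score == 0]


# Toy example with short, made-up sequences and small parameters.
toy_proteome = {"A": "MKTAYIAKQR", "B": "GGGAYIAKLL"}
print(window_scores("A", toy_proteome, window=6, k=4))
print(unique_windows("A", toy_proteome, window=6, k=4))
```

In the toy example, only the window starting at position 0 of protein A avoids the k-mers it shares with protein B. A realistic analysis would use a similarity measure that tolerates mismatches and would combine the score with predicted features such as transmembrane regions and signal peptides.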
The PrESTs are transferred from the computer to the laboratory through the design of PrEST-specific PCR primers. A study of the success rate in PCR cloning of the selected fragments demonstrated the importance of a controlled GC content in the primers for specific amplification. The PrEST protein is produced in bacteria and used both for immunization and for the subsequent affinity purification of the resulting sera to generate mono-specific antibodies. The antibodies are tested for specificity, and approved antibodies are used for tissue profiling in normal and cancer tissues. A large-scale analysis of the success rates for different PrESTs in the experimental pipeline of the HPR project showed that the total success rate from PrEST selection to an approved antibody is 31%, and that this rate depends on PrEST length. A second PrEST on a target protein is somewhat less likely to succeed in the HPR pipeline if the first PrEST was unsuccessful. Nevertheless, the analysis shows that it is valuable to select several PrESTs for each protein to enable generation of at least two antibodies, which can be used to validate each other.
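The GC-content criterion for primers mentioned above can be made concrete with a short sketch. The function names and the 40-60% acceptance range are assumptions chosen for illustration; the limits actually used in the HPR primer design are not stated here.

```python
def gc_fraction(primer):
    """Return the fraction of G and C bases in a primer sequence."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer sequence is empty")
    return sum(base in "GC" for base in seq) / len(seq)


def gc_controlled(primer, low=0.40, high=0.60):
    """Check whether the GC fraction lies within an assumed acceptable range."""
    return low <= gc_fraction(primer) <= high


# Made-up primer sequences, used only to exercise the functions.
print(gc_fraction("ATGCATGC"))
print(gc_controlled("ATGCATGC"))
print(gc_controlled("GGGGCCCC"))
```

A primer design step could use such a check to reject or redesign candidate primers before the PCR cloning stage.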