What makes E. coli O104 so deadly?
24 Apr 2012 by Evoluted New Media
In May last year a major outbreak of foodborne illness was caused by E. coli. Genomic and mass spec technologies could combine to detect such outbreaks in the future. Here the Health Protection Agency and Thermo Fisher Scientific discuss using state-of-the-art nano-LC-MS/MS technology to investigate the toxicity of microorganisms.
Microbial diagnostic laboratories perform analyses to identify a pathogen accurately and to understand an organism's ability to produce toxic proteins, including those involved in infectious diseases. The ability to identify proteins of significance in a disease process may be diagnostically or clinically applicable to a wide range of illnesses caused by microorganisms. Consequently, the ultimate aim of microbial diagnostics is to significantly reduce risk to human health and provide more effective treatment options. Advanced nano-LC-MS/MS technology, in concert with sophisticated software and bioinformatics tools, was developed primarily to enable scientists to identify the expressed protein complement of a cell or tissue sample (proteomics), and can therefore be applied to microbiological samples to analyse the specific protein fingerprint of infectious microorganisms. Current MS-based methods used for microbial diagnostics rely on matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI TOF MS) profiles. A drawback of MALDI TOF MS is its inability to identify the ions that characterise the mass spectrum of the pathogen.
The Department for Bioanalysis and Horizon Technologies of the UK’s Health Protection Agency (HPA), in collaboration with Thermo Fisher Scientific, is exploring the use of proteomics to better characterise infectious microorganisms. One recent example of a proteomics approach using nano-LC-MS/MS technology is the collaborative work between the HPA and Thermo Fisher Scientific to understand the toxicity and pathology of the E. coli O104:H4 strain, the causative agent in a recent, deadly E. coli outbreak in Europe.
The team at the HPA has been actively engaged in searching for unique, stable protein signatures derived against a background of genome sequences of a range of pathogens compared to related non-pathogenic species. This has enabled the establishment of a pipeline for biomarker discovery, integrating protein sequencing with mass spectrometry and bioinformatics technology [1]. In brief, the procedure involves efficient cell lysis, protein separation using sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE), digestion of proteins to peptides with proteases, and nano-liquid chromatography (nano-LC) coupled to a mass spectrometer, which produces MS/MS spectra of the peptides that can then be used to identify the proteins. These are analysed against the genome database to characterise expressed prevalent markers and to identify genus, species and clonal types. A feasibility project undertaken collaboratively by the HPA and Thermo Fisher Scientific is now exploring the usability of this new procedure as a high-resolution solution for profiling a broad range of human pathogens, through a programme to streamline separation, improve comparative display and enhance coverage of markers across the proteome.
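The protease digestion step can be illustrated with a minimal in silico sketch in Python. It assumes trypsin as the protease and the commonly used simplified rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function name `tryptic_digest` and the example sequence are hypothetical and are not taken from the HPA pipeline.

```python
def tryptic_digest(protein, min_length=1):
    """Split a protein sequence into peptides using a simplified trypsin rule.

    Assumption: cleave after K or R, except when the next residue is P.
    Missed cleavages and other real-world effects are ignored.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    # Keep the C-terminal fragment if the sequence does not end in K or R.
    if start < len(protein):
        peptides.append(protein[start:])
    return [p for p in peptides if len(p) >= min_length]


if __name__ == "__main__":
    # Hypothetical toy sequence, for illustration only.
    print(tryptic_digest("MAGKTLRPEEK"))
```

In practice, digestion software also models missed cleavages, which is one reason observed peptides can contain an internal K or R.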
In May 2011, a serious outbreak of foodborne illness, which included diarrhoea and haemolytic uremic syndrome (HUS), occurred in Germany and spread to neighbouring countries. Detective clinical microbiology soon identified the culprit E. coli strain as the causative agent of the outbreak, which resulted in more than 4000 reported cases and approximately 50 deaths. E. coli are mainly commensal organisms, but several pathotypes of diarrhoeagenic E. coli can be discerned by phenotypic and genetic traits. A key characteristic of the outbreak was that the HUS manifestations were similar to those triggered by Shiga toxin-producing E. coli. Diarrhoea leading to HUS is traditionally associated with Shiga toxin production, and the strains referred to as EHEC (enterohaemorrhagic E. coli) contain the enterocyte effacement pathogenicity island. However, serotype characterisation of the German outbreak strain matched it to serotype O104:H4, a classical enteroaggregative pathotype of E. coli. Such strains do not normally harbour Shiga toxin but have the aggregative (adhesion) factors of enteroaggregative E. coli (EAEC). In addition, the strain lacked the enterocyte effacement pathogenicity island, making rapid and accurate characterisation of the strain even more difficult. Thus, there was confusion as to the cause of the outbreak.
Genome sequencing was undertaken at the Beijing Genomics Institute, China, and at the University of Münster/Life Technologies, Germany, using rapid short reads that revealed the gene content of the strain. The HPA then released its own de novo assembly of the E. coli isolate H112180280, generated in house using paired-end sequencing and the long-read capability of the Roche GS Junior System, and assembled the entire genome and plasmids [2]. This enabled rapid re-assembly of genomes sequenced globally and confirmed that all the isolates originated from one strain. It was concluded that the core genome is made up of 5,224,248 bp (see Figures). Additionally, there are two large plasmids, P1 of 87,140 bp and P2 of 70,233 bp. The location of the Shiga toxin-harbouring phage was also identified.
Figure 1: Genome view of E. coli O104:H4, highlighting key virulence factors associated with the chromosome (figure 1) and the aggregative plasmid (figure 2)
HPA and Thermo Fisher Scientific researchers wanted to investigate how the genome sequencing findings could be translated into an MS-based approach to detect the outbreak and discover new mosaic strains in the future. The genomic data revealed how gene transfers between pathotypes and serotypes of E. coli can lead to the emergence of a potent new pathogen. In addition, the data allowed scientists to re-examine previous isolates of serotype O104, and it became apparent that this was not a unique event. There are previous reports of O104 strains taking up Shiga toxin genes, and genetic transfer events will continue to lead to a mosaic of new pathogens. Taking this into consideration, it was concluded that evolving proteomics-based diagnostics should incorporate not only signature markers of the genus and species, but also markers of virulence within a species or pathotype, in order to identify emerging pathogens that cause such devastating outbreaks. In response, the MS-based signature detection pipeline was modified to extract virulence signatures (characterised from comparative genetics databases) and adapt them to the identification protocol.
A total of five E. coli strains belonging to serotype O104 were subjected to proteomic analysis. These were three clinical isolates from patients affected by the German outbreak (the genome sequences of all three isolates confirmed that they were from the same strain) and two others previously characterised as serotype O104 but with EAEC and EHEC genetic composition respectively. All strains were cultured in LB broth and on agar and harvested before two parallel approaches were employed to reduce the complexity of the mixture for MS analysis. In the first approach, lysates were separated by SDS-PAGE and gel slices were digested with trypsin. Peptides were analysed using nano-LC-MS/MS. In the second approach, the entire cell lysate was digested directly with trypsin in solution and injected onto two LC-MS/MS systems. The peptide MS/MS spectra were matched to both protein and in silico genome-translated databases to yield protein identifications. These were then fed into the bioinformatics pipeline [1] to acquire unique signatures at the genus and species levels. An extensive list of identified peptides was then searched, using Blast and Scaffold, for virulence determinants, E. coli virulence factors, and putative EHEC- and EAEC-specific virulence markers.
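Matching spectra to a database starts from peptide masses, so a small Python sketch of that arithmetic may help. It computes the monoisotopic mass and the m/z of a peptide from standard residue masses (rounded to five decimals). The function names `peptide_mass` and `peptide_mz` are hypothetical, and the sketch assumes unmodified peptides. It illustrates only the precursor-mass step, not the database search software used in the study.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # one water is added for the free N- and C-termini
PROTON = 1.00728   # mass of the charge carrier in positive-ion mode


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified peptide (assumption: no modifications)."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def peptide_mz(sequence, charge=2):
    """m/z of the peptide carrying `charge` protons."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


if __name__ == "__main__":
    # Signature peptides quoted in the article; the charge state is an assumption.
    for peptide in ["MSAIKIR", "EGDQLLR", "LRSTDSR", "FTQNYSSLSAVQK"]:
        print(peptide, round(peptide_mass(peptide), 4), round(peptide_mz(peptide), 4))
```

A search engine compares such calculated masses, and the masses of predicted fragment ions, with the measured spectra to propose peptide identifications.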
Figure 2
Cells grown on plates or in broth produced an extensive list of protein signatures. Combining the peptide lists from plates and broth resulted in a list of proteins covering much of the predicted open reading frames of the sequenced outbreak strain genome. This reflects the sensitivity and reliability of the nano-LC-MS/MS method to yield protein profiles using selective or enriched culture preparation. The aim was then to investigate whether peptides resulting from high-abundance proteins include unique markers and signatures for genus and species or virulence detection. Experimental results demonstrate that this was achieved, as confirmed by comparing the virulence signature of the new, mosaic outbreak strain to both EAEC and EHEC protein signatures obtained using the same proteome approach. Analysis of the outbreak strains detected similar profiles. Data for isolate E. coli O104:H4 strain 280 (the first genome to be assembled into near-complete topology) are shown in Figure 1. A total of approximately 2,500 proteins from the outbreak isolates were identified. These were mapped into genus-, species- and strain-specific markers (not shared by other bacterial strain sequences in the protein database). A total of 68 peptide signatures were identified which delineate the outbreak E. coli isolates, separating them from other closely related Enterobacteriaceae. At the species level, specific peptide signatures such as MSAIKIR (AggR transcription factor), EGDQLLR (haemolysin protein), LRSTDSR (Aaf fimbriae protein) and FTQNYSSLSAVQK (Iha adhesion protein) were detected. In total, 3,031 peptides were identified as unique to the outbreak strains when compared against control isolates. In addition, the Shiga toxins and Pic serine protease (autotransporter toxin) were detected. Encouragingly, genomic features such as tellurium resistance, genetically detected in the outbreak strain, were also identified via proteomics.
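The idea of peptides that are unique to the outbreak strains when compared against control isolates can be sketched as a set operation in Python. This is an illustrative simplification, not the published pipeline: the function name `unique_signatures` is hypothetical, the shared peptides are invented placeholders, and the assignment of peptides to isolates is toy data.

```python
def unique_signatures(outbreak_isolates, control_isolates):
    """Peptides found in every outbreak isolate and in no control isolate.

    Each argument is a non-empty list (outbreak) or possibly empty list
    (controls) of sets of peptide sequences, one set per isolate.
    """
    shared_by_outbreak = set.intersection(*outbreak_isolates)
    seen_in_controls = set().union(*control_isolates)
    return shared_by_outbreak - seen_in_controls


if __name__ == "__main__":
    # Toy data: "LVEDK" and "AAGTPR" are hypothetical housekeeping peptides.
    outbreak = [
        {"MSAIKIR", "EGDQLLR", "LVEDK", "AAGTPR"},
        {"MSAIKIR", "EGDQLLR", "LVEDK"},
        {"MSAIKIR", "EGDQLLR", "LVEDK", "AAGTPR"},
    ]
    controls = [
        {"LVEDK", "AAGTPR"},
        {"LVEDK"},
    ]
    print(sorted(unique_signatures(outbreak, controls)))
```

Requiring a peptide to be present in every outbreak isolate is a deliberately strict design choice; a real workflow might instead tolerate missing observations in individual runs.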
The list of peptides was subsequently filtered using the pipeline described above to remove peptides derived from physiological and regulatory proteins, thus reducing the complexity of the list. Search of the residual list for E. coli pathotype virulence determinants and virulence factors resulted in a definitive list of expressed virulence determinants of the outbreak strain. Figure 2 shows these results diagrammatically. The results also support the view that the background genome (biome) came from an EAEC progenitor that acquired plasmids and prophages, and exchanged chromosomal loci leading to the emergence of an aggressive strain with a distinctive profile. All strains shared 89% of the expressed proteins. A total of 31 proteins were encoded by the two plasmids. Peptide signatures for adhesion and multidrug resistance (including β-lactamase, CTX-M extended spectrum β-lactamase and Metallo-β-lactamase enzymes) were observed.
Nano-LC-MS/MS technology helps scientists to better understand the role of pathogenic microorganisms in causing illness and disease in humans. Using this powerful technique, scientists are able to study microorganisms and determine how the genetic code is being translated into the protein building blocks that determine traits such as toxicity. The method was implemented in this study to determine what makes the E. coli O104 strain so devastating and how it may be controlled. Experimental results demonstrate that nano-LC-MS/MS accelerates research to identify the sources of E. coli-related illnesses and diseases. The technique was able to identify a significant number of proteins with no requirement for enrichment, selective media or antibiotic incorporation. The authors believe that proteomic analysis of microorganisms based on nano-LC-MS/MS provides definitive characterisation at the genus, species and often strain level. Furthermore, it will enable the detection of pathogenic determinants and antibiotic resistance mechanisms.
The Authors: Haroun N. Shah, Min Fang, Raju Misra, Tom Gaulton, Nadia Ahmod, Renata Culak and Saheer E. Gharbia from the HPA, and Martin Hornshaw, Jenny Ho and Ali Ball from Thermo Fisher Scientific.
References
1. Al-Shahib et al., 2010. Coherent pipeline for biomarker discovery using mass spectrometry and bioinformatics. BMC Bioinformatics, 11:437.
2. Health Protection Agency, ‘HPA scientists unlock secrets of E. coli outbreak strain’, 13 June 2011, http://www.hpa.org.uk/NewsCentre/NationalPressReleases/2011PressReleases/110613Ecoligenome/