David Martin

Dr David Martin

Position: Bioinformatics Scientific Officer
Division: Biological Chemistry and Drug Discovery
Address: College of Life Sciences, University of Dundee, Dundee
Telephone: +44 1382 388704, int ext. 88704
Email: d.m.a.martin@dundee.ac.uk

David explores large biological data sets from next generation sequencing and proteomics, developing new tools to unlock the secrets of life within.

Research

David Martin's research focuses on high throughput analysis of large data sets. One major challenge of modern biology is dealing with the increasing rate at which data is produced. This data needs to be captured, managed, interpreted and represented in ways which allow inferences to be drawn by experimentalists.

Automated interpretation of post-translational modification by MS-MS

Protein phosphorylation (the post-translational modification of a protein by addition of a phosphate group at a specific point in a polypeptide chain) is a key mechanism used by the cell to control processes such as growth and cell death. In collaboration with Professor Mike Ferguson, we are performing a global phosphoproteome analysis of the Trypanosoma brucei parasite.

Mass spectrometry can identify proteins through accurate measurement of the mass of peptide fragments. Subsequent fragmentation of specific peptide ions provides a tandem MS (MS-MS) spectrum, allowing the peptide sequence to be determined following database searching with appropriate software such as MASCOT. This software provides a list of matches in the database and highlights the putative position of any post-translational modification. In order to verify the database match, the MS-MS spectrum must be inspected by an expert as the database search algorithms are not designed for post-translational modification identification. However, complex protein mixtures such as whole cell lysates can provide many hundreds of thousands of spectra, manual validation of which is extremely time consuming.

To automate this process, a post-search validation method has been developed. The database match is re-evaluated against a set of criteria which model the expected chemical behaviour of the peptide fragments. The criteria are stringent, so the method provides high-confidence hits, leaving the expert free to assess the harder spectra which at present cannot be assessed automatically. With this methodology we can currently identify approximately half the phosphorylation sites automatically, with negligible false positives. There is plenty of potential for further improvement, opening the possibility of rapid, high throughput phosphoproteome scans.
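As an illustration of what one such criterion might look like, the sketch below checks an MS-MS spectrum for the neutral loss of phosphoric acid (H3PO4, about 98 Da) from the precursor ion, a fragmentation behaviour commonly seen for phosphoserine and phosphothreonine peptides. This is not the validation method described above: the function names, the mass tolerance and the intensity threshold are all assumptions chosen for the example.

```python
H3PO4 = 97.976896  # monoisotopic mass of phosphoric acid (Da)


def neutral_loss_mz(precursor_mz, charge):
    """m/z expected when the precursor loses one H3PO4 and keeps its charge."""
    return precursor_mz - H3PO4 / charge


def has_neutral_loss(peaks, precursor_mz, charge, tol=0.5, min_rel_intensity=0.2):
    """Return True if the spectrum contains a prominent neutral-loss peak.

    peaks: list of (mz, intensity) tuples.
    tol: m/z tolerance (assumed value, instrument dependent).
    min_rel_intensity: minimum intensity relative to the base peak (assumed value).
    """
    if not peaks:
        return False
    base_intensity = max(intensity for _, intensity in peaks)
    target = neutral_loss_mz(precursor_mz, charge)
    return any(
        abs(mz - target) <= tol and intensity >= min_rel_intensity * base_intensity
        for mz, intensity in peaks
    )
```

A real validation scheme would combine several criteria of this kind, for example also requiring fragment ions that localise the modification to a particular residue.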

Sequencing Chromosome 4 of the Potato genome

Genome sequencing raises considerable computational challenges. Integration of genetic and physical map data, mapping of clones and assembly of sequence data all require substantial bioinformatic support. In addition, next generation sequencing technologies such as Solexa and 454 have the potential to radically change the approach taken to obtaining novel genomes.

In collaboration with colleagues at the Scottish Crop Research Institute (Dr Glenn Bryan), Teagasc Ireland (Dr Dan Milbourne) and Imperial College London (Dr Gerard Bishop), we will be determining the genomic sequence of potato (Solanum tuberosum) chromosome 4.

The project provides many bioinformatic challenges and opportunities. The strain being sequenced is heterozygous, adding complexity to construction of tiling paths and data integration. There are many strains with different traits which can be mapped to regions of interest, and Potato is closely related to Tomato, allowing direct interspecific comparison. Data to date have been obtained using classical Sanger sequencing in a BAC by BAC strategy. We will be exploring the utility of NGS data, both for BAC by BAC approaches and for whole genome shotgun assemblies. In addition, new methods and techniques are being developed.

This project is funded by BBSRC grant BB/F012640/1.

GOtcha - providing qualified functional assignments.

GOtcha [10] is a methodology for rapid functional annotation of gene products. Gene Ontology is a hierarchical description of the function of a gene product. Specific terms are linked to more general terms, providing a tree-like structure. GOtcha makes use of this hierarchy to examine the functional assignments of similar gene products and provide a likelihood that the sequence in question has each particular function.
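The idea of using the hierarchy can be sketched as follows: each similar gene product contributes a weight to the terms it is annotated with and to every more general term above them, so that general terms accumulate support from many hits. This is a minimal sketch and not the GOtcha scoring scheme itself; the term names, the weights and the normalisation by total weight are assumptions made for the example.

```python
from collections import defaultdict


def ancestors(term, parents):
    """Return the term together with all of its more general terms.

    parents: dict mapping a term to the list of its direct parent terms.
    """
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def propagate_scores(hits, parents):
    """Accumulate hit weights over the hierarchy.

    hits: list of (weight, terms) pairs, one per similar gene product.
    Each hit adds its weight once to every term it implies. Scores are
    divided by the total weight, giving a relative support in [0, 1].
    """
    scores = defaultdict(float)
    for weight, terms in hits:
        implied = set()
        for term in terms:
            implied |= ancestors(term, parents)
        for term in implied:
            scores[term] += weight
    total = sum(weight for weight, _ in hits)
    return {term: score / total for term, score in scores.items()} if total else {}


# Toy hierarchy with hypothetical term names.
toy_parents = {"kinase": ["catalytic"], "catalytic": ["root"], "binding": ["root"]}
toy_hits = [(3.0, ["kinase"]), (1.0, ["binding"])]
toy_scores = propagate_scores(toy_hits, toy_parents)
```

In this toy case the root term is supported by every hit, while the more specific terms receive support only from the hits annotated at or below them.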

GOtcha assignments for an entire genome can be mapped to metabolic pathways, indicating biological processes which may be present or absent in that organism. GOtcha was used in the analysis of the genome data for Plasmodium falciparum (Malaria), Trypanosoma brucei (African Sleeping Sickness), Trypanosoma cruzi (Chagas disease), Leishmania major (Leishmaniasis), and Brugia malayi (elephantiasis).

www.compbio.dundee.ac.uk/gotcha/gotcha.php

Kinomer - Detailed classification of protein kinases

In collaboration with Dr Diego Miranda-Saavedra (now at Cambridge University). Application of subgroup specific Hidden Markov Models for classification and identification of protein kinases.
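Classification with subgroup specific models can be pictured as scoring a sequence against each subgroup's model and assigning it to the best-scoring subgroup, provided the score clears a threshold. The sketch below assumes the scores have already been computed by an HMM search tool; the function name, the example scores and the threshold are hypothetical.

```python
def classify(scores, min_score=0.0):
    """Assign a sequence to the subgroup whose model scores it highest.

    scores: dict mapping subgroup name to the score of the sequence
            against that subgroup's model.
    Returns the best subgroup, or None if no score exceeds min_score.
    """
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > min_score else None


# Made-up scores for one sequence against three kinase group models.
example_scores = {"AGC": 12.5, "CMGC": 88.1, "TK": -3.0}
assigned = classify(example_scores, min_score=20.0)
```

A sequence that scores below the threshold against every model is left unclassified rather than forced into the nearest subgroup.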

www.compbio.dundee.ac.uk/kinomer

Detailed classification of SNF2-like helicases

In collaboration with Dr Andrew Flaus (now at University of Galway, Ireland). Calculation and application of subgroup specific Hidden Markov Models for classification and identification of SNF2-like [7] chromatin remodellers.

www.snf2.org

Publications

  1. Waterhouse AM, Procter JB, Martin DMA, Clamp M and Barton GJ Jalview Version 2 - a multiple sequence alignment editor and analysis workbench Bioinformatics 2009 25 (9) 1189-1191; doi:10.1093/bioinformatics/btp033
  2. Martin DMA, Miranda-Saavedra D and Barton GJ Kinomer v. 1.0: A database of systematically classified eukaryotic protein kinases Nucleic Acids Research 2009 37: D244-D250; doi:10.1093/nar/gkn834
  3. Nett IRE, Martin DMA, Miranda-Saavedra D, Lamont D, Barber JD, Mehlert A and Ferguson MAJ The phosphoproteome of bloodstream form Trypanosoma brucei, causative agent of African Sleeping Sickness Molecular and Cellular Proteomics 2008 (submitted)
  4. Towler MC, Fogarty S, Hawley SA, Pan DA, Martin DMA, Morrice NA, McCarthy A, Galardo MN, Meroni SB, Cigorraga SB, Ashworth A, Sakamoto K and Hardie DG A novel short splice variant of the tumour suppressor LKB1 is required for spermiogenesis Biochem. J. 2008 416 1–14; doi:10.1042/BJ20081447
  5. Overton IM, van Niekerk CA, Carter LG, Dawson A, Martin DM, Cameron S, McMahon SA, White MF, Hunter WN, Naismith JH, Barton GJ. TarO: a target optimisation system for structural biology. Nucleic Acids Res. 2008 36 W190-W196; doi:10.1093/nar/gkn141
  6. Ghedin E, Wang S, Spiro D, Caler E, Zhao Q, Crabtree J, Allen JE, Delcher AL, Guiliano DB, Miranda-Saavedra D, Angiuoli SV, Creasy T, Amedeo P, Haas B, El-Sayed NM, Wortman JR, Feldblyum T, Tallon L, Schatz M, Shumway M, Koo H, Salzberg SL, Schobel S, Pertea M, Pop M, White O, Barton GJ, Carlow CK, Crawford MJ, Daub J, Dimmic MW, Estes CF, Foster JM, Ganatra M, Gregory WF, Johnson NM, Jin J, Komuniecki R, Dorf I, Kumar S, Laney S, Li BW, Li W, Lindblom TH, Lustigman S, Ma D, Maina CV, Martin DMA, McCarter JP, McReynolds L, Mitreva M, Nutman TB, Parkinson J, Peregrín-Alvarez JM, Poole C, Ren Q, Saunders L, Sluder AE, Smith K, Stanke M, Unnasch TR, Ware J, Wei AD, Weil G, Williams DJ, Zhang Y, Williams SA, Fraser-Liggett C, Slatko B, Blaxter ML, Scott AL. Draft genome of the filarial nematode parasite Brugia malayi. Science. 2007, 317, 1756-60. 
  7. "Identification of multiple distinct Snf2 subfamilies with conserved structural motifs" Flaus A, Martin DMA, Barton GJ and Owen-Hughes T Nucleic Acids Research (2006) 34, 2887-2905.
  8. "A preliminary crystallographic analysis of the putative mevalonate diphosphate decarboxylase from Trypanosoma brucei" Byres E, Martin DMA, Hunter WN Acta Crystallographica (2005) F61, 581-584
  9. "Retrotransposon populations of Vicia species with varying genome size" Hill P, Burford D, Martin DMA, Flavell AJ Molecular Genetics and Genomics (2005) (online), PMID: 15891910
  10. "GOtcha: a new method for prediction of protein function assessed by the annotation of seven genomes." Martin DMA, Berriman M, Barton GJ BMC Bioinformatics. (2004) 5, 178 PMID: 15550167
  11. "ELM server: A new resource for investigating short functional sites in modular eukaryotic proteins." Puntervoll P, Linding R, Gemund C, Chabanis-Davidson S, Mattingsdal M, Cameron S, Martin DMA, Ausiello G, Brannetti B, Costantini A, Ferre F, Maselli V, Via A, Cesareni G, Diella F, Superti-Furga G, Wyrwicz L, Ramu C, McGuigan C, Gudavalli R, Letunic I, Bork P, Rychlewski L, Kuster B, Helmer-Citterich M, Hunter WN, Aasland R, Gibson TJ Nucleic Acids Res. (2003) 31, 3625-30 PMID: 12824381
  12. "Visual representation of database search results: the RHIMS Plot." Martin DMA, Hill P, Barton GJ, Flavell AJ. Bioinformatics. (2003) 19, 1037-8 PMID: 12761069
  13. "Evaluation of annotation strategies using an entire genome sequence." Iliopoulos I, Tsoka S, Andrade MA, Enright AJ, Carroll M, Poullet P, Promponas V, Liakopoulos T, Palaios G, Pasquier C, Hamodrakas S, Tamames J, Yagnik AT, Tramontano A, Devos D, Blaschke C, Valencia A, Brett D, Martin DMA, Leroy C, Rigoutsos I, Sander C, Ouzounis CA Bioinformatics (2003) 19, 717-26 PMID: 12691983
  14. "Genome sequence of the human malaria parasite Plasmodium falciparum." Gardner MJ, Hall N, Fung E, White O, Berriman M, Hyman RW, Carlton JM, Pain A, Nelson KE, Bowman S, Paulsen IT, James K, Eisen JA, Rutherford K, Salzberg SL, Craig A, Kyes S, Chan MS, Nene V, Shallom SJ, Suh B, Peterson J, Angiuoli S, Pertea M, Allen J, Selengut J, Haft D, Mather MW, Vaidya AB, Martin DMA, Fairlamb AH, Fraunholz MJ, Roos DS, Ralph SA, McFadden GI, Cummings LM, Subramanian GM, Mungall C, Venter JC, Carucci DJ, Hoffman SL, Newbold C, Davis RW, Fraser CM, Barrell B. Nature (2002) 419, 498-511 PMID: 12368864