Dr David Martin FRSB FHEA
Research
David Martin's research focused on high throughput analysis of large data sets. One major challenge of modern biology is dealing with the increasing rate at which data is produced, and extracting biological meaning from it. This data needs to be captured, managed, interpreted and represented in ways which allow inferences to be drawn by experimentalists.
David spends most of his time teaching but former projects include the following:
Automated interpretation of post-translational modification by MS-MS
Protein phosphorylation (the post-translational modification of a protein by addition of a phosphate group at a specific point in a polypeptide chain) is a key mechanism that is used by the cell to control processes such as growth and cell death. In collaboration with Professor Mike Ferguson we are performing a global phosphoproteome analysis of the Trypanosoma brucei parasite.
Mass spectrometry can identify proteins through accurate measurement of the mass of peptide fragments. Subsequent fragmentation of specific peptide ions provides a tandem MS (MS-MS) spectrum, allowing the peptide sequence to be determined following database searching with appropriate software such as MASCOT. This software provides a list of matches in the database and highlights the putative position of any post-translational modification. In order to verify the database match, the MS-MS spectrum must be inspected by an expert as the database search algorithms are not designed for post-translational modification identification. However, complex protein mixtures such as whole cell lysates can provide many hundreds of thousands of spectra, manual validation of which is extremely time consuming.
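To make the link between peptide sequence and fragment masses concrete, the following is a minimal sketch of how the expected singly charged b- and y-ion m/z values of a peptide can be calculated from standard monoisotopic residue masses. The function name `fragment_ions` and its `phospho_sites` parameter are illustrative assumptions and are not taken from MASCOT or any other search software.

```python
# Standard monoisotopic residue masses in daltons for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565    # H2O, carried by every y ion
PROTON = 1.007276    # charge carrier for a singly charged ion
PHOSPHO = 79.966331  # HPO3 added to a phosphorylated S, T or Y


def fragment_ions(peptide, phospho_sites=()):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    phospho_sites holds 0-based residue positions that carry a phosphate
    group (a hypothetical convention chosen for this sketch).
    """
    masses = [
        RESIDUE_MASS[aa] + (PHOSPHO if i in phospho_sites else 0.0)
        for i, aa in enumerate(peptide)
    ]
    b_ions, y_ions = [], []
    # Each backbone cleavage gives one N-terminal (b) and one C-terminal (y) ion.
    for cut in range(1, len(masses)):
        b_ions.append(sum(masses[:cut]) + PROTON)
        y_ions.append(sum(masses[cut:]) + WATER + PROTON)
    return b_ions, y_ions
```

Comparing the lists returned for different candidate `phospho_sites` shows which fragment ions shift by the mass of the phosphate group, which is the information used to place a modification on a particular residue.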
To automate this process, a post-search validation method has been developed. The database match is re-evaluated against a set of criteria which model the expected chemical behaviour of the peptide fragments. The criteria are stringent, so the method provides high-confidence hits and frees the expert to concentrate on the harder spectra that cannot yet be assessed automatically. With this methodology we can at present automatically identify approximately half the phosphorylation sites with negligible false positives. There is plenty of potential for further improvement, giving the possibility of rapid, high throughput phosphoproteome scans.
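As an illustration of what one such criterion might look like, the sketch below checks a spectrum for the neutral loss of phosphoric acid from the precursor ion, a behaviour commonly seen for phosphoserine and phosphothreonine peptides. It is not the published method: the function names, the tolerance and the intensity threshold are all assumptions chosen for the example.

```python
H3PO4 = 97.976896  # monoisotopic mass of phosphoric acid in daltons


def has_neutral_loss(peaks, precursor_mz, charge, tol=0.5, min_rel_intensity=0.1):
    """Check for a precursor ion that has lost H3PO4.

    peaks is a list of (m/z, intensity) pairs. tol (in m/z units) and
    min_rel_intensity (fraction of the base peak) are arbitrary values
    assumed for this sketch.
    """
    if not peaks:
        return False
    expected = precursor_mz - H3PO4 / charge
    base = max(intensity for _, intensity in peaks)
    return any(
        abs(mz - expected) <= tol and intensity >= min_rel_intensity * base
        for mz, intensity in peaks
    )


def validate(spectrum, criteria):
    """Accept a match only if every criterion passes (stringent by design)."""
    return all(criterion(spectrum) for criterion in criteria)
```

Requiring every criterion to pass trades sensitivity for confidence: some genuine sites will be rejected, but the accepted ones should need little manual checking.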
Sequencing Chromosome 4 of the Potato genome
Genome sequencing raises considerable challenges. Integration of genetic and physical map data, mapping of clones and assembly of sequence data require considerable bioinformatic support. In addition, next generation sequencing technologies such as Solexa and 454 have the potential to radically change the approach taken to obtaining novel genomes.
In collaboration with colleagues at the Scottish Crop Research Institute (Dr Glenn Bryan), Teagasc Ireland (Dr Dan Milbourne) and Imperial College London (Dr Gerard Bishop), we will be determining the genomic sequence of potato (Solanum tuberosum) chromosome 4.
The project provides many bioinformatic challenges and opportunities. The strain being sequenced is heterozygous, adding complexity to construction of tiling paths and data integration. There are many strains with different traits which can be mapped to regions of interest, and Potato is closely related to Tomato allowing for direct interspecific comparison. Data at present has been determined using classical Sanger sequencing in a BAC by BAC strategy. We will be exploring the utility of NGS data, both for BAC by BAC approaches and whole genome shotgun assemblies. In addition, new methods and techniques are being developed.
This project is funded by BBSRC grant BB/F012640/1.
GOtcha - providing qualified functional assignments.
GOtcha [4] is a methodology for rapid functional annotation of gene products. Gene Ontology is a hierarchical description of the function of a gene product. Specific terms are linked to more general terms, providing a tree-like structure. GOtcha makes use of this hierarchy to examine the functional assignment of similar gene products and provide a likelihood for the sequence in question to have that particular function.
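The idea of using the hierarchy can be sketched as follows: each similar gene product contributes a weight to the terms it is annotated with and to all of their more general ancestors, and the totals are then expressed relative to the root term. The toy ontology, the weights and the function names below are hypothetical, and the result is a relative score rather than the calibrated likelihood that GOtcha reports.

```python
from collections import defaultdict

# Hypothetical toy ontology: each term maps to its more general parent terms.
PARENTS = {
    "protein kinase activity": ["kinase activity"],
    "kinase activity": ["catalytic activity"],
    "hydrolase activity": ["catalytic activity"],
    "catalytic activity": ["molecular_function"],
    "molecular_function": [],
}
ROOT = "molecular_function"


def ancestors(term):
    """Return the term together with every more general term above it."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def score_terms(hits):
    """Score terms from a list of (weight, annotated_terms) pairs.

    Each pair represents one similar gene product; the weight stands in
    for the strength of the sequence similarity (an assumption here).
    """
    totals = defaultdict(float)
    for weight, terms in hits:
        covered = set()
        for term in terms:
            covered |= ancestors(term)
        for term in covered:  # a hit counts at most once per term
            totals[term] += weight
    if not totals:
        return {}
    root_total = totals[ROOT]
    return {term: total / root_total for term, total in totals.items()}
```

Because every hit also supports the ancestors of its terms, general functions accumulate more support than specific ones, so a confident general assignment can be made even when the specific function is uncertain.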
GOtcha assignments for an entire genome can be mapped to metabolic pathways, indicating biological processes which may be present or absent in that organism. GOtcha was used in the analysis of the genome data for Plasmodium falciparum (Malaria), Trypanosoma brucei (African Sleeping Sickness), Trypanosoma cruzi (Chagas disease), Leishmania major (Leishmaniasis), and Brugia malayi (elephantiasis).
Kinomer - Detailed classification of protein kinases
In collaboration with Dr Diego Miranda-Saavedra (now at Cambridge University). Application of subgroup specific Hidden Markov Models for classification and identification of protein kinases.
Detailed classification of SNF2-like helicases
In collaboration with Dr Andrew Flaus (now at University of Galway, Ireland). Calculation and application of subgroup specific Hidden Markov Models for classification and identification of SNF2-like [1] chromatin remodellers.
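Classification with subgroup specific Hidden Markov Models comes down to scoring a sequence against the model for each subgroup and assigning it to the best scoring subgroup, provided the score is convincing. The sketch below assumes the scores have already been obtained from a profile HMM search tool; the function name, the subgroup labels and the threshold of 20 are illustrative assumptions only.

```python
def classify(scores, threshold=20.0):
    """Assign a sequence to the subgroup whose model scores highest.

    scores maps a subgroup name to the score of the sequence against that
    subgroup's HMM. Returns None when no model reaches the threshold, so
    that weak matches are left unclassified rather than forced into a group.
    """
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


# Hypothetical scores for one sequence against three kinase group models.
example = {"AGC": 35.2, "CAMK": 80.1, "TK": 12.0}
print(classify(example))
```

Using one model per subgroup rather than a single family-wide model means the same search both detects a family member and places it in the classification.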
Teaching
David is part of the core level 1 & 2 teaching team, delivering a core curriculum across all Life Sciences degrees. His particular focus is on numeracy and data literacy, bringing quantitative methodologies into the approach that students take as part of their developing roles as active scientists. Students are encouraged to build up their data analysis skills as a key tool in enabling them to question and evaluate the world around them.
Modules taught:
Core level 1 and 2 modules
BS11005 Introduction to Maths, Physics and Chemistry
BS31003 Molecular Structure and Interactions
BS32010 Applied Bioinformatics
BS32011/2 Bioinformatics practical project
BS42003 Advanced Bioinformatics
Modules managed:
BS21002 The Cell and the Gene
BS22002 Biological Sciences
BS32010 Applied Bioinformatics
BS42003 Advanced Bioinformatics
Key Teaching Achievements:
Nominated for Student Led Teaching Awards 2014
Produced 'Lost in Translation', a board game for 1st year/A2 students to reinforce understanding of DNA transcription and translation
Produced Gigsaw - a teaching tool for understanding Next Generation Sequence analysis
Other roles:
D'Arcy Thompson Unit Divisional Representative for the Division of Plant Sciences.
Publications
My ORCID is http://orcid.org/0000-0002-8732-204X
34. Urbaniak MD, Martin DMA, Ferguson MAJ. Global Quantitative SILAC Phosphoproteomics Reveals Differential Phosphorylation Is Widespread between the Procyclic and Bloodstream Form Lifecycle Stages of Trypanosoma brucei. Journal of Proteome Research 2013 12 2233-2244
33. Nelson SA, Li Z, Newton IP, Fraser D, Milne RE, Martin DMA, Schiffmann D, Yang X, Dormann D, Weijer CJ, Appleton PL, Naethke IS. Tumorigenic fragments of APC cause dominant defects in directional cell migration in multiple model systems. Disease Models & Mechanisms 2012 5 940-947
31. Potato Genome Sequencing Consortium Genome sequence and analysis of the tuber crop potato Nature 2011 475 189-194
30. van Koningsbruggen S, Gierlinski M, Schofield P, Martin DMA, Barton GJ, Ariyurek Y, den Dunnen JT, Lamond AI High-resolution whole-genome sequencing reveals that specific chromatin domains from most human chromosomes associate with nucleoli Molecular Biology of the Cell 2010 21(21) 3735-48.
29. Martin DMA, Nett IRE, Vandermoere F, Barber JD, Morrice N and Ferguson MAJ Prophossi: Automating Expert Validation of Phosphopeptides from Tandem Mass Spectrometry Bioinformatics 2010 26 (17), 2153-2159
28. Nett IRE, Martin DMA, Miranda-Saavedra D, Lamont D, Barber JD, Mehlert A and Ferguson MAJ The phosphoproteome of bloodstream form Trypanosoma brucei, causative agent of African Sleeping Sickness Molecular and Cellular Proteomics 2008 8(7) 1527-38
27. Waterhouse AM, Procter JB, Martin DMA, Clamp M and Barton GJ Jalview Version 2 - a multiple sequence alignment editor and analysis workbench Bioinformatics 2009 25 (9) 1189-1191; doi:10.1093/bioinformatics/btp033
26. Martin DMA, Miranda-Saavedra D and Barton GJ Kinomer v. 1.0: A database of systematically classified eukaryotic protein kinases Nucleic Acids Research 2009 37: D244-D250; doi:10.1093/nar/gkn834
25. Towler MC, Fogarty S, Hawley SA, Pan DA, Martin DMA, Morrice NA, McCarthy A, Galardo MN, Meroni SB, Cigorraga SB, Ashworth A, Sakamoto K and Hardie DG A novel short splice variant of the tumour suppressor LKB1 is required for spermiogenesis Biochem. J. 2008 416 1–14; doi:10.1042/BJ20081447
24. Overton IM, van Niekerk CA, Carter LG, Dawson A, Martin DM, Cameron S, McMahon SA, White MF, Hunter WN, Naismith JH, Barton GJ. TarO: a target optimisation system for structural biology. Nucleic Acids Res. 2008 36 W190-W196; doi:10.1093/nar/gkn141
23. Ghedin E, Wang S, Spiro D, Caler E, Zhao Q, Crabtree J, Allen JE, Delcher AL, Guiliano DB, Miranda-Saavedra D, Angiuoli SV, Creasy T, Amedeo P, Haas B, El-Sayed NM, Wortman JR, Feldblyum T, Tallon L, Schatz M, Shumway M, Koo H, Salzberg SL, Schobel S, Pertea M, Pop M, White O, Barton GJ, Carlow CK, Crawford MJ, Daub J, Dimmic MW, Estes CF, Foster JM, Ganatra M, Gregory WF, Johnson NM, Jin J, Komuniecki R, Korf I, Kumar S, Laney S, Li BW, Li W, Lindblom TH, Lustigman S, Ma D, Maina CV, Martin DMA, McCarter JP, McReynolds L, Mitreva M, Nutman TB, Parkinson J, Peregrín-Alvarez JM, Poole C, Ren Q, Saunders L, Sluder AE, Smith K, Stanke M, Unnasch TR, Ware J, Wei AD, Weil G, Williams DJ, Zhang Y, Williams SA, Fraser-Liggett C, Slatko B, Blaxter ML, Scott AL. Draft genome of the filarial nematode parasite Brugia malayi. Science. 2007, 317, 1756-60
22. Flaus A, Martin DMA, Barton GJ, Owen-Hughes T Identification of multiple distinct Snf2 subfamilies with conserved structural motifs. Nucleic Acids Res. 2006, 34, 2887-905.
21: Berriman M, Ghedin E, Hertz-Fowler C, Blandin G, Renauld H, Bartholomeu DC, Lennard NJ, Caler E, Hamlin NE, Haas B, Bohme U, Hannick L, Aslett MA, Shallom J, Marcello L, Hou L, Wickstead B, Alsmark UC, Arrowsmith C, Atkin RJ, Barron AJ, Bringaud F, Brooks K, Carrington M, Cherevach I, Chillingworth TJ, Churcher C, Clark LN, Corton CH, Cronin A, Davies RM, Doggett J, Djikeng A, Feldblyum T, Field MC, Fraser A, Goodhead I, Hance Z, Harper D, Harris BR, Hauser H, Hostetler J, Ivens A, Jagels K, Johnson D, Johnson J, Jones K, Kerhornou AX, Koo H, Larke N, Landfear S, Larkin C, Leech V, Line A, Lord A, Macleod A, Mooney PJ, Moule S, Martin DMA, Morgan GW, Mungall K, Norbertczak H, Ormond D, Pai G, Peacock CS, Peterson J, Quail MA, Rabbinowitsch E, Rajandream MA, Reitter C, Salzberg SL, Sanders M, Schobel S, Sharp S, Simmonds M, Simpson AJ, Tallon L, Turner CM, Tait A, Tivey AR, Van Aken S, Walker D, Wanless D, Wang S, White B, White O, Whitehead S, Woodward J, Wortman J, Adams MD, Embley TM, Gull K, Ullu E, Barry JD, Fairlamb AH, Opperdoes F, Barrell BG, Donelson JE, Hall N, Fraser CM, Melville SE, El-Sayed NM. The genome of the African trypanosome Trypanosoma brucei. Science 2005, 309 416-22
20: Byres E, Martin DMA, Hunter WN. A preliminary crystallographic analysis of the putative mevalonate diphosphate decarboxylase from Trypanosoma brucei Acta Crystallographica 2005, F61 581-584
19: Hill P, Burford D, Martin DMA, Flavell AJ. Retrotransposon populations of Vicia species with varying genome size. Molecular Genetics and Genomics 2005, 273 371-81
18: Martin DM, Berriman M, Barton GJ. GOtcha: a new method for prediction of protein function assessed by the annotation of seven genomes. BMC Bioinformatics. 2004, 5 178
17: Puntervoll P, Linding R, Gemund C, Chabanis-Davidson S, Mattingsdal M, Cameron S, Martin DM, Ausiello G, Brannetti B, Costantini A, Ferre F, Maselli V, Via A, Cesareni G, Diella F, Superti-Furga G, Wyrwicz L, Ramu C, McGuigan C, Gudavalli R, Letunic I, Bork P, Rychlewski L, Kuster B, Helmer-Citterich M, Hunter WN, Aasland R, Gibson TJ. ELM server: A new resource for investigating short functional sites in modular eukaryotic proteins. Nucleic Acids Res. 2003, 31 3625-30.
16: Martin DM, Hill P, Barton GJ, Flavell AJ. Visual representation of database search results: the RHIMS Plot. Bioinformatics. 2003, 19 1037-8.
15: Iliopoulos I, Tsoka S, Andrade MA, Enright AJ, Carroll M, Poullet P, Promponas V, Liakopoulos T, Palaios G, Pasquier C, Hamodrakas S, Tamames J, Yagnik AT, Tramontano A, Devos D, Blaschke C, Valencia A, Brett D, Martin D, Leroy C, Rigoutsos I, Sander C, Ouzounis CA. Evaluation of annotation strategies using an entire genome sequence. Bioinformatics. 2003, 19 717-26.
14: Gardner MJ, Hall N, Fung E, White O, Berriman M, Hyman RW, Carlton JM, Pain A, Nelson KE, Bowman S, Paulsen IT, James K, Eisen JA, Rutherford K, Salzberg SL, Craig A, Kyes S, Chan MS, Nene V, Shallom SJ, Suh B, Peterson J, Angiuoli S, Pertea M, Allen J, Selengut J, Haft D, Mather MW, Vaidya AB, Martin DM, Fairlamb AH, Fraunholz MJ, Roos DS, Ralph SA, McFadden GI, Cummings LM, Subramanian GM, Mungall C, Venter JC, Carucci DJ, Hoffman SL, Newbold C, Davis RW, Fraser CM, Barrell B. Genome sequence of the human malaria parasite Plasmodium falciparum. Nature. 2002, 419 498-511.
13: Martin DM, Wiiger MT, Prydz H. Tissue factor and biotechnology. Thromb Res. 1998, 90 1-25.
12: Ashton AW, Kemball-Cook G, Johnson DJ, Martin DM, O'Brien DP, Tuddenham EG, Perkins SJ. Factor VIIa and the extracellular domains of human tissue factor form a compact complex: a study by X-ray and neutron solution scattering. FEBS Lett. 1995, 374 141-6.
11: Martin DM, Boys CW, Ruf W. Tissue factor: molecular recognition and cofactor function. FASEB J. 1995, 9 852-9
10: Ruf W, Kelly CR, Schullek JR, Martin DM, Polikarpov I, Boys CW, Tuddenham EG, Edgington TS. Energetic contributions and topographical organization of ligand binding residues of tissue factor. Biochemistry. 1995, 34 6310-5.
9: O'Brien DP, Kemball-Cook G, Hutchinson AM, Martin DM, Johnson DJ, Byfield PG, Takamiya O, Tuddenham EG, McVey JH. Surface plasmon resonance studies of the interaction between factor VII and tissue factor. Demonstration of defective tissue factor binding in a variant FVII molecule (FVII-R79Q). Biochemistry. 1994, 33 14162-9.
8: Harlos K, Martin DM, O'Brien DP, Jones EY, Stuart DI, Polikarpov I, Miller A, Tuddenham EG, Boys CW. Crystal structure of the extracellular region of human tissue factor. Nature. 1994, 370 662-6.
7: Martin DM, Tuddenham EG. Activation of factor X by factor VIIa on monocyte cell surfaces. Blood. 1994, 83 3828-9.
6: Martin DM, O'Brien DP, Tuddenham EG, Byfield PG. Synthesis and characterization of wild-type and variant gamma-carboxyglutamic acid-containing domains of factor VII. Biochemistry. 1993, 32 13949-55.
5: Boys CW, Miller A, Harlos K, Martin DM, Tuddenham EG, O'Brien DP. Crystallization and preliminary X-ray analysis of human tissue factor extracellular domain. J Mol Biol. 1993, 234 1263-5.
4: Takamiya O, Kemball-Cook G, Martin DM, Cooper DN, von Felten A, Meili E, Hann I, Prangnell DR, Lumley H, Tuddenham EG, et al. Detection of missense mutations by single-strand conformational polymorphism (SSCP) analysis in five dysfunctional variants of coagulation factor VII. Hum Mol Genet. 1993, 2 1355-9.
3: O'Brien DP, Anderson JS, Martin DM, Byfield PG, Tuddenham EG. Structural requirements for the interaction between tissue factor and factor VII: characterization of chymotrypsin-derived tissue factor polypeptides. Biochem J. 1993, 292 7-12.
2: Cooke RM, Carter BG, Martin DM, Murray-Rust P, Weir MP. Nuclear magnetic resonance studies of the snake toxin echistatin. 1H resonance assignments and secondary structure. Eur J Biochem. 1991, 202 323-8.
1: Gould H, Sutton B, Beavil A, Edmeades R, Martin D. Immunoglobulin E receptors. Clin Exp Allergy. 1991, 21 138-47
Impact
As well as the research tools listed above, my key impacts have been:
Developing novel teaching tools that have been very well received.
- http://www.lifesci.dundee.ac.uk/studying/media/videos/molecular-biology-...
- http://www.compbio.dundee.ac.uk/gigsaw
Writing the user manual for key bioinformatics software
Why I Teach
The best thing in life is discovering things that you never knew before. Every day is a learning day. As a researcher it was and is my privilege to discover things that nobody has ever known about how the world around us is made and works. But that is magnified if you can share it. Teaching is a key component of research, and it is a privilege to be able to open the eyes of the next generation to see the world around them in new ways. In Dundee we have great students and it is a pleasure to teach and train them to change the world.
What I Teach
Statistics/Data analysis:
We use RStudio as the core technology for all our data analysis and statistical analysis. Reproducibility is key to effective science, so students are encouraged to include their R scripts in project reports and analyses so that errors or misunderstandings can be cleared up, or so we can learn new ways of doing things when they have found something really cool.
Techniques and Tools
Wherever possible I like students to perform real research instead of 'toy' practicals. We should contribute to knowledge as we learn.
We use Jalview extensively as a workbench for sequence analysis and comparison from the beginning of our course. In addition, Chimera is used for protein visualisation. Where possible, all the data analysis and visualisation software is cross-platform and freely available: tools and knowledge the students can take with them.
I introduce students to a taster of Python programming in year 2. For those keen to pursue this further, the bioinformatics 3rd year modules extend skills and abilities in Python and R applied to current activities including next generation sequencing to analyse gene expression. In fourth year we broaden knowledge by looking at application of bioinformatics algorithms across different subject areas.
Student Projects:
I supervise a number of students in their honours year project. This can be a lab project where we apply bioinformatics to a real problem and maybe test our hypotheses in the lab, or science communication where students develop and deliver educational activities to an appropriate audience.
Selected recent student projects:
- Identification and validation of novel microsatellite markers in the Soprano Pipistrelle Pipistrellus pygmaeus
- A card game to educate on crop development and GMO technology
Where I Come From
I grew up in London and studied Chemistry with Biochemistry at King's College London.
During my degree I took an industrial year with Glaxo where I worked on HIV protease and snake venom proteins. This led to an interest in protein structure and function and a PhD at the MRC Clinical Research Centre in the group of Professor Ted Tuddenham working on blood coagulation proteins.
After completing my PhD I moved to the Biotechnology Centre at the University of Oslo, discovered that practical molecular biology was probably not my strong point and transferred to bioinformatics, working on the EU GeneQuiz project. This was my first experience of a multinational EU project.
After a short period as manager of the Norwegian EMBnet node I moved to Dundee in 2001, working on a number of different projects. In 2013 I moved across campus to take up a full-time position in Learning and Teaching with the aim of developing the data literacy (numeracy/stats/bioinformatics) aspect of the curriculum to the level required for a modern life scientist.