Briefings in Bioinformatics Advance Access originally published online on September 25, 2006
Briefings in Bioinformatics 2007 8(1):65-67; doi:10.1093/bib/bbl031
Book Review
Essential Bioinformatics
Jin Xiong
Cambridge University Press; ISBN: 0-521-60082-0; 339pp.; 2006;
£65.00 (hardcover); £29.00 (paperback).
Few would dispute the need for today's college biology majors to have basic skills in bioinformatics. Yet their undergraduate faculty face several challenges in providing these skills, particularly at smaller colleges. First, faculty members who teach bioinformatics have usually been trained in molecular biology, genetics or biochemistry, so most do not have extensive applied mathematics experience beyond statistics. Second, bioinformatics textbooks for undergraduate biology majors are rare. Most bioinformatics books are geared to researchers, computer programmers or graduate students. Others are simple user manuals, with little coverage of critical evaluation of the output. Third, most students today have great point-and-click computing skills, but minimal understanding of, or patience for, command-line computing or programming.
In light of these challenges to introducing undergraduate students to bioinformatics, it was quite a joy to read and review Professor Jin Xiong's recent book, Essential Bioinformatics. This compact, economical first-edition textbook was written specifically for the typical Junior or Senior level Life Science undergraduate student, who has a sound background in Genetics, Molecular Biology and Biochemistry, and a semester of statistics, but little to no computer programming experience. The language is not specialized, and most computing terms are clearly defined within the text and in a glossary. The emphasis is on understanding the limitations of these computing tools, the importance of using more than one computing tool for analyzing data, and criteria for developing a consensus conclusion when the output differs between those various tools.
Essential Bioinformatics is divided into seven increasingly complex sections, each composed of 2-4 chapters. The first section introduces databases by describing database structure, formats for searchable data archiving and connections between databases. The remaining six sections are devoted to the processes and computing tools for Sequence Alignment, Gene and Promoter Prediction, Molecular Phylogenies, Structural Bioinformatics, Genomics and Proteomics. From the first section (Biological Databases) onward, Xiong reveals a knack for thorough but not overwhelming presentation, which persists even through the most difficult chapters on protein structure (including a thorough review of thermodynamic considerations in protein folding) and microarray analysis.
Nearly all chapters are embellished with summary tables as well as clean and simple black and white diagrams. In addition, some figures (such as those illustrating tertiary protein structures, RNA structure, and examples of microarray data analysis) are collated as color plates in the center of the book. The presentation is thorough yet generic enough that changes in web portals should present little confusion for novices. Notably, Xiong reminds readers how more complex tools are built from simpler tools. For example, computing tools that predict protein structure or identify conserved domain structure were developed from earlier tools designed to perform local alignment of linear sequences. Without having intimate knowledge of the design of the alignment programs, students are still able to appreciate the evolution of bioinformatics tools. Finally, Xiong frequently reminds readers that the human brain is indispensable for assessing the output of computing tools, dispelling a common student misconception that computers are better than humans at analyzing complex data.
An important issue in bioinformatics is the understanding that mathematical modeling is used to develop bioinformatics tools. Yet one of the challenges for biology undergraduates (and for many of the molecular biology or genetics instructors who teach bioinformatics) is a lack of practical experience in mathematics beyond statistical testing of experimental data. Xiong clearly presents mathematical models in a way that helps the reader understand the basis of the algorithms used in bioinformatics tools. Mathematical formulas are usually accompanied by complementary tables and line diagrams, making complex derivations unnecessary. An effective example of this presentation appears in Chapter 10, where Xiong uses figures to compare the mathematical consequences of doing phylogenetic analysis of more than three sequences; readers will readily understand the challenges of choosing accuracy and reliability (innate desires of scientists) over computational speed (based on the physical limits of computers). They will then understand why there are so many tools for constructing phylogeny, the importance of resolving errors introduced by selecting faster tools, the challenge of resolving conflicting data, and the importance of statistical testing for validating conclusions.
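To make the trade-off between accuracy and speed concrete, here is a minimal sketch of why exhaustive phylogenetic searches become impractical as sequences are added. It is not taken from Xiong's book; it uses the standard counting result that n sequences admit (2n-5)!! unrooted and (2n-3)!! rooted bifurcating tree topologies. The function names are my own and purely illustrative.

```python
def double_factorial(k: int) -> int:
    """Return k * (k - 2) * (k - 4) * ... down to 1 or 2 (1 if k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def unrooted_tree_count(n_taxa: int) -> int:
    """Number of unrooted bifurcating topologies for n_taxa >= 3 sequences."""
    if n_taxa < 3:
        raise ValueError("an unrooted bifurcating tree needs at least 3 taxa")
    return double_factorial(2 * n_taxa - 5)


def rooted_tree_count(n_taxa: int) -> int:
    """Number of rooted bifurcating topologies for n_taxa >= 2 sequences."""
    if n_taxa < 2:
        raise ValueError("a rooted bifurcating tree needs at least 2 taxa")
    return double_factorial(2 * n_taxa - 3)


if __name__ == "__main__":
    # Tabulate how quickly the search space grows with the number of sequences.
    for n in (3, 4, 5, 10, 20):
        print(n, unrooted_tree_count(n), rooted_tree_count(n))
```

Three sequences allow only one unrooted topology, while ten already allow more than two million, which is the kind of growth that pushes tool designers toward faster heuristic methods.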
Each chapter begins with a brief paragraph outlining what the reader can expect to learn, and concludes with a concise summary (usually half a page) that helps the reader determine if they understand the major concepts. Readers interested in delving deeper into chapter topics are directed to several well-chosen references. Although I suspect that most of these references will be too advanced for undergraduates, they will prove a useful resource for those who are learning bioinformatics for research purposes and may therefore need further understanding of mathematical modeling used in the computing tools.
Readers of Essential Bioinformatics can also assess learning by performing six Practical Exercises, all located in Appendix One. Each exercise corresponds to one of the first six sections of the book. Unlike individual stand-alone problems typical of most undergraduate texts, these exercises are more like capstone projects to be completed in stepwise segments, simulating a bioinformatics research project. Some Practical Exercises require raw data provided via the publisher's web site (access requires a user name and password to be acquired from the publisher by the instructor). In some cases, students are asked to repeat some analyses using several different bioinformatics tools, to compare and critically evaluate output, and then to decide on a consensus conclusion.
For this review, I completed two of the six Practical Exercises, Exercise 4: Phylogenetic Analysis and Exercise 6: Gene Prediction (a segment of a longer exercise). The phylogeny exercise required a couple of hours to complete (and will probably take longer for novices), while I was able to complete the gene prediction (of a Heliobacillus mobilis sequence) in less than an hour. The instructions were clear and concise, and all URLs were active (what a treat). The author designed the exercises to be performed on a UNIX workstation, but I was able to perform the two aforementioned exercises using Mac OS X 10.4. On a Windows 98 platform, PostScript files could not be viewed except by printing, which required installation of additional software. Those unfamiliar with UNIX will benefit from knowing that nedit or scratch files can be saved as simple text (Mac) or Notepad (PC) files, and that PostScript files can be opened by Adobe Reader, then converted to PDF files (when using Mac OS X).
Even though the Practical Exercises were broken into segments more or less corresponding to chapters, it would be helpful if there were some shorter, warm-up exercises at the end of each chapter before students are presented with the multi-chapter capstone projects. Also, the lack of sample data to explore concepts in Functional Genomics (transcriptome analysis) and Proteomics makes the extensive list and brief descriptions of bioinformatics tools for these processes a reference, rather than a learning resource. A useful addition to a subsequent edition would be samples of microarray and proteome data to analyze and interpret. These data sets can be quite large, presenting a challenge for downloading from a web site. Perhaps a representative subset of larger data sets could be sorted and pared for educational use.
The major criticism I have of this text is an annoying number of errors that should have been caught in the editing process (I found several misspellings as well as many examples of plural verbs with singular nouns or vice versa). Figures in Chapter 10 also had too many errors, which is a serious problem given the importance of the figures in clarifying the mathematical models. For example, in Figures 10.6 and 10.7 it appears that a master figure was copied and pasted so that the copies could be modified, but the modification was forgotten. I also noticed that Xiong occasionally used synonyms not found in the glossary, and that some terms biologists do not often encounter were not defined (hash, linear discriminant analysis, discriminant analysis, node and kernel are some examples). Hopefully these will be corrected or improved in a second edition.
In summary, Essential Bioinformatics is a welcome resource for introducing undergraduate students (or those totally new to the field) to the basics of bioinformatics. Enough information about the mathematical workings of tools commonly used in bioinformatics is presented for users to appreciate how data is being analyzed. Clean, clear figures clarify complex mathematical models. Importantly, students reading this text will understand the limits of bioinformatics tools (most imposed by constraints of computational time) and realize that computers cannot replace humans in critically evaluating output. Finally, the capstone exercises illustrate the interdisciplinary nature of the field and assess student understanding of most topics. Chapter topics and Practical Exercises represent diverse areas where bioinformatics is crucial to analyzing data. These have the added potential to stimulate interesting classroom discussion and generate interest in future studies in bioinformatics. Thank you, Professor Xiong, for this welcome teaching and learning resource!
Associate Professor of Biology,
University of Indianapolis,
1400 E Hanna Ave,
Indianapolis, IN 46227, USA
E-mail: mritke{at}uindy.edu
Submitted: August 21, 2006. Accepted: August 24, 2006.