Beginning Perl for Bioinformatics (Paperback)

James Tisdall



With its highly developed capacity to detect patterns in data, Perl has become one of the most popular languages for biological data analysis. But if you're a biologist with little or no programming experience, starting out in Perl can be a challenge. Many biologists have a difficult time learning how to apply the language to bioinformatics. The most popular Perl programming books are often too theoretical and too focused on computer science for a non-programming biologist who needs to solve very specific problems.

Beginning Perl for Bioinformatics is designed to get you quickly over the Perl language barrier by approaching programming as an important new laboratory skill, revealing Perl programs and techniques that are immediately useful in the lab. Each chapter focuses on solving a particular bioinformatics problem or class of problems, starting with the simplest and increasing in complexity as the book progresses. Each chapter includes programming exercises and teaches bioinformatics by showing and modifying programs that deal with various kinds of practical biological problems. By the end of the book you'll have a solid understanding of Perl basics, a collection of programs for such tasks as parsing BLAST and GenBank, and the skills to take on more advanced bioinformatics programming. Some of the later chapters focus in greater detail on specific bioinformatics topics. This book is suitable for use as a classroom textbook, for self-study, and as a reference.

The book covers:

  • Programming basics and working with DNA sequences and strings
  • Debugging your code
  • Simulating gene mutations using random number generators
  • Regular expressions and finding motifs in data
  • Arrays, hashes, and relational databases
  • Regular expressions and restriction maps
  • Using Perl to parse PDB records, annotations in GenBank, and BLAST output

Table of Contents


1. Biology and Computer Science
     The Organization of DNA
     The Organization of Proteins
     In Silico
     Limits to Computation

2. Getting Started with Perl
     A Low and Long Learning Curve
     Perl's Benefits
     Installing Perl on Your Computer
     How to Run Perl Programs
     Text Editors
     Finding Help

3. The Art of Programming
     Individual Approaches to Programming
     Edit-Run-Revise (and Save)
     An Environment of Programs
     Programming Strategies
     The Programming Process

4. Sequences and Strings
     Representing Sequence Data
     A Program to Store a DNA Sequence
     Concatenating DNA Fragments
     Transcription: DNA to RNA
     Using the Perl Documentation
     Calculating the Reverse Complement in Perl
     Proteins, Files, and Arrays
     Reading Proteins in Files
     Scalar and List Context

5. Motifs and Loops
     Flow Control
     Code Layout
     Finding Motifs
     Counting Nucleotides
     Exploding Strings into Arrays
     Operating on Strings
     Writing to Files

6. Subroutines and Bugs
     Scoping and Subroutines
     Command-Line Arguments and Arrays
     Passing Data to Subroutines
     Modules and Libraries of Subroutines
     Fixing Bugs in Your Code

7. Mutations and Randomization
     Random Number Generators
     A Program Using Randomization
     A Program to Simulate DNA Mutation
     Generating Random DNA
     Analyzing DNA

8. The Genetic Code
     Data Structures and Algorithms for Biology
     The Genetic Code
     Translating DNA into Proteins
     Reading DNA from Files in FASTA Format
     Reading Frames

9. Restriction Maps and Regular Expressions
     Regular Expressions
     Restriction Maps and Restriction Enzymes
     Perl Operations

10. GenBank
     GenBank Files
     GenBank Libraries
     Separating Sequence and Annotation
     Parsing Annotations
     Indexing GenBank with DBM

11. Protein Data Bank
     Files and Folders
     PDB Files
     Parsing PDB Files
     Controlling Other Programs

12. BLAST
     Obtaining BLAST
     String Matching and Homology
     BLAST Output Files
     Parsing BLAST Output
     Presenting Data

13. Further Topics
     The Art of Program Design
     Web Programming
     Algorithms and Sequence Alignment
     Object-Oriented Programming
     Perl Modules
     Complex Data Structures
     Relational Databases
     Microarrays and XML
     Graphics Programming
     Modeling Networks
     DNA Computers

A. Resources

B. Perl Summary





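- Programming basics and working with DNA sequences and strings
- Debugging code
- Simulating gene mutations using random number generators
- Regular expressions and finding motifs in data
- Arrays, hashes, and relational databases
- Regular expressions and restriction maps
- Using Perl to parse PDB records, annotations in GenBank, and BLAST output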

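- Preface
- Chapter 1: Biology and Computer Science
- Chapter 2: Getting Started with Perl
- Chapter 3: The Art of Programming
- Chapter 4: Sequences and Strings
- Chapter 5: Motifs and Loops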