Learn the data skills necessary for turning large sequencing datasets into reproducible and robust biological findings. With this practical guide, you'll learn how to use freely available open source tools to extract meaning from large, complex biological data sets.
At no other point in human history has our ability to understand life's complexities been so dependent on our skills to work with and analyze data. This intermediate-level book teaches the general computational and data skills you need to analyze biological data. If you have experience with a scripting language like Python, you're ready to get started.
Go from handling small problems with messy scripts to tackling large problems with clever methods and tools
Process bioinformatics data with powerful Unix pipelines and data tools
Learn how to use exploratory data analysis techniques in the R language
Use efficient methods to work with genomic range data and range operations
Work with common genomics data file formats like FASTA, FASTQ, SAM, and BAM
Manage your bioinformatics project with the Git version control system
Tackle tedious data processing tasks with Bash scripts and Makefiles
About the Author
Vince Buffalo is currently a first-year graduate student studying population genetics in Graham Coop's lab at UC Davis in the Population Biology Graduate Group. Before starting his PhD in population genetics, Vince worked professionally as a bioinformatician in the Bioinformatics Core at the UC Davis Genome Center and in the Department of Plant Sciences. An obsessive programmer since he was a young teenager, Vince was drawn to the statistical and computational problems of genomics. He works on open source bioinformatics tools in his work and free time, and enjoys fly fishing and cooking when away from the computer.