A central problem in ecological and evolutionary genetics is to identify the genomic basis of phenotypic variation in natural populations. Addressing this problem provides basic knowledge of biological processes, and it is also of significant practical importance, because lineages of organisms that cannot move to new habitats must either adapt to changing environmental conditions or go extinct. Traditional genetic markers allow the estimation of some population genetic parameters, such as average heterozygosity or neutral population structure. However, genome-scale questions, such as identifying specific loci that are under selection or responsible for ecologically relevant phenotypic variation, have been more difficult to address. A significant impediment has been the great expense of developing genetic markers at the density required for these types of research projects in each new study organism. The advent of next-generation sequencing approaches has lifted this impediment, but optimal tools and best practices are still being developed.
In this talk I will describe our work in developing Illumina-sequenced Restriction site Associated DNA (RAD) markers as a tool for the genomic analysis of non-model organisms. This approach allows researchers to simultaneously identify and genotype tens of thousands of single nucleotide polymorphisms (SNPs) in nearly any organism, and can be used for population genetic and phylogeographic studies as well as functional studies of allele-specific gene expression. The rapid evolution of threespine stickleback fish in new environments provides a good case study. Using sequenced RAD tags, we have found parallel patterns of divergence across the genomes of multiple stickleback populations that have adapted to similar environmental conditions, identified candidate genes underlying phenotypic variation in single populations, and found changes in gene expression under different environmental conditions. The plethora of new genetic data has introduced a computational problem, and I will also describe our progress in developing publicly available software pipelines to distill the flood of next-generation sequencing data into usable information. I will use these results as a starting point to discuss the general usefulness, and the significant remaining hurdles, of RADseq and other next-generation sequencing tools for various ecological, evolutionary and conservation genomic studies in non-model organisms.