The Sequence Read Archive (SRA) now contains more than 2 petabases of high-throughput sequence data. One petabase of these data is open access, while the rest consists of sequences from 40,000 individuals who participated in human clinical studies catalogued in dbGaP.