A note on open genomics

A few months ago I purchased a decent desktop just to run ADMIXTURE and other packages for analyzing genomic data. More recently I set up a ~100 GB Dropbox account and started to “push” all of my output files from ADMIXTURE, PLINK, etc., as well as various scripts (Perl, shell, R, etc.), into the public folder. More precisely, a script is running ADMIXTURE and moving the files into the appropriate Dropbox folders as I type this, and Dropbox syncs them with the online folders. I’m doing this for two reasons.

First, I want to make the pipeline of data generation easier for me. Instead of running ADMIXTURE and then laboriously processing the files with R to generate plots, I’ve now created a system where a few automated scripts begin ADMIXTURE runs, and then another script creates the input files for distruct, runs distruct, trims the output images, and converts them into PNGs. This should allow me to resurrect my side projects, even while I’m rather busy with the “main events” of my life.
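
For readers who want a feel for what such a pipeline looks like, here is a rough sketch in Python. It is not my actual script (those are Perl, shell, and R). The function names, the per-K `drawparams.<K>` file, and the `<stem>.<K>.ps` output name are all assumptions made for illustration. It also assumes that `admixture`, `distruct`, and ImageMagick's `convert` are on the PATH, and it leaves out the step that reshapes ADMIXTURE's Q file into distruct's input format.

```python
from pathlib import Path
import subprocess


def plan_pipeline(bed_file, k_values):
    """Build the commands for each K without running anything.

    Assumptions (hypothetical, for illustration only):
      - a distruct parameter file named drawparams.<K> already exists for each K
      - that parameter file tells distruct to write <stem>.<K>.ps
    """
    stem = Path(bed_file).stem
    commands = []
    for k in k_values:
        ps_file = f"{stem}.{k}.ps"
        png_file = f"{stem}.{k}.png"
        # ADMIXTURE takes the input file and K; it writes <stem>.<K>.Q and
        # <stem>.<K>.P into the current working directory.
        commands.append(["admixture", str(bed_file), str(k)])
        # distruct reads its settings from the parameter file given with -d.
        commands.append(["distruct", "-d", f"drawparams.{k}"])
        # ImageMagick: trim the whitespace border and convert PostScript to PNG.
        commands.append(["convert", ps_file, "-trim", png_file])
    return commands


def run_pipeline(bed_file, k_values):
    """Run each planned command in order, stopping at the first failure."""
    for cmd in plan_pipeline(bed_file, k_values):
        subprocess.run(cmd, check=True)
```

Calling `plan_pipeline("data/hgdp.bed", [2, 3])` (the file name is made up) returns six commands, three per value of K, which you can inspect before handing them to `run_pipeline`. Keeping the planning separate from the running makes it easy to check what a long batch will do before committing the machine to it.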

Second, I am beginning to feel that the promise of the “genome blogging revolution” has faded somewhat. Granted, there’s only so much you can do with the same data sets, so I’m going to try to put together large pedigree files in my Dropbox account. But it seems like people need more of a push. Toward that end, I hope that distributing scripts which make the process more “turnkey” will stimulate people going forward.

Addendum: I know that some of the first paragraph is going to be gibberish to some readers. But I hope you’ll appreciate the outcomes of that gibberish!

Source: Discover Magazine – Gene Expression