RNA-Seq Analysis for Rice


I performed the data analysis for a research paper that studied time-series RNA-Seq data from rice. My fellow developers and I built an analysis pipeline for the RNA-Seq data that identifies differentially expressed genes (DEGs) and describes their gene ontology (GO) annotations (cellular component, molecular function, biological process).

I wrote a short C++ parser that generates a CSV file from the Gene Ontology (GO) for plants, mapping each GO number to its respective term.

We used the DESeq2 R package, developed by Love et al., for the differential expression analysis.

The gene that was knocked out is believed to play a role in rice germination by controlling the release of amylases during starch digestion for the rice embryo.

Some of the figures that were prepared include:

  • Cross correlation to identify genes with similar expression patterns
  • PCA by time point and treatment group (wild type and mutant)
  • UpSet plot summarizing the number of upregulated and downregulated DEGs across 6 different time points
  • Heat map summarizing the most significant gene ontology terms for DEGs at a given time point
  • Mirrored bar plot for summarizing the number of upregulated/downregulated DEGs
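
To make the first item in the list above more concrete, here is a small sketch of how similarity between two genes' expression profiles can be scored. It uses a zero-lag Pearson correlation across time points, which is a simplifying assumption and may differ from the measure used for the actual figure. The function name `pearson` and the choice to report 0 for a flat profile are choices made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: Pearson correlation between two expression profiles,
// each holding one value per time point. A result near 1 suggests the two
// genes rise and fall together; a result near -1 suggests opposite patterns.
double pearson(const std::vector<double>& x, const std::vector<double>& y) {
    if (x.size() != y.size() || x.size() < 2) {
        throw std::invalid_argument("profiles must have equal length >= 2");
    }
    const double n = static_cast<double>(x.size());

    // Mean of each profile.
    double mean_x = 0.0, mean_y = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        mean_x += x[i];
        mean_y += y[i];
    }
    mean_x /= n;
    mean_y /= n;

    // Sums of squared deviations and of cross products.
    double sxx = 0.0, syy = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double dx = x[i] - mean_x;
        const double dy = y[i] - mean_y;
        sxx += dx * dx;
        syy += dy * dy;
        sxy += dx * dy;
    }

    // A flat profile has no variance, so the correlation is undefined.
    // Returning 0 here is a convention chosen for this sketch.
    if (sxx == 0.0 || syy == 0.0) return 0.0;
    return sxy / std::sqrt(sxx * syy);
}
```

Computing this score for every pair of genes gives a symmetric matrix that can then be clustered or thresholded to group genes with similar patterns.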

Photo by Sergio Camalich https://unsplash.com/photos/green-wheat-in-macro-shot-4Fzp6z40Jhs