Background

Small RNA sequencing (small RNA-Seq) is a technique for isolating and sequencing small RNA species, such as microRNAs (miRNAs), short interfering RNAs (siRNAs), and piwi-interacting RNAs (piRNAs). Small RNA-Seq can query thousands of small RNA and miRNA sequences with high sensitivity and a wide dynamic range. With small RNA-Seq you can discover novel miRNAs and other small noncoding RNAs, and examine the differential expression of all small RNAs in a sample.

Figure: smallRNA-Seq workflow


1. Raw data statistics

Once you provide the raw data, we report summary statistics, including the number of reads, genome coverage (x), and base distribution.
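As a rough illustration of these statistics, the sketch below counts reads and bases in FASTQ input and derives the base distribution and fold coverage. It assumes simple four-line FASTQ records; the function name `fastq_stats` and the `genome_size` parameter are illustrative choices, not part of any tool named in this workflow.

```python
from collections import Counter
from io import StringIO

def fastq_stats(handle, genome_size):
    """Return read count, total bases, per-base fractions and fold coverage.

    Assumes each record spans exactly four lines (header, sequence, '+', quality).
    """
    n_reads = 0
    base_counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the sequence is the second line of each record
            n_reads += 1
            base_counts.update(line.strip().upper())
    total = sum(base_counts.values())
    fractions = {b: c / total for b, c in sorted(base_counts.items())} if total else {}
    return {
        "reads": n_reads,
        "bases": total,
        "base_fraction": fractions,
        # Fold coverage (x) = sequenced bases / genome length
        "coverage_x": total / genome_size,
    }

# Two toy reads against a hypothetical 4 bp "genome"
example = StringIO("@r1\nACGT\n+\nIIII\n@r2\nAACC\n+\nIIII\n")
stats = fastq_stats(example, genome_size=4)
print(stats)
```

For real data the handle would be an opened FASTQ file, and the genome size would come from the chosen reference assembly.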


2. Quality check

Quality control checks on raw sequence data from high-throughput sequencing provide a quick impression of whether your data have any problems you should be aware of before further analysis. FastQC is used to assess the quality of the data provided. We typically check for adapter contamination, read quality, and other sequencing biases.
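To make "read quality" concrete, the following sketch decodes quality strings and averages the scores at each read position. It assumes Phred+33 encoding and is only an illustration of the idea, not a reproduction of how FastQC computes its reports; the function names are hypothetical.

```python
def phred33(quality_string):
    """Decode a Phred+33 quality string into integer scores."""
    return [ord(ch) - 33 for ch in quality_string]

def mean_quality_per_position(quality_strings):
    """Average the Phred score at each read position across reads.

    Reads may differ in length, which is common after small RNA trimming.
    """
    sums, counts = [], []
    for q in quality_strings:
        for i, score in enumerate(phred33(q)):
            if i == len(sums):  # first read long enough to reach this position
                sums.append(0)
                counts.append(0)
            sums[i] += score
            counts[i] += 1
    return [s / c for s, c in zip(sums, counts)]

# 'I' encodes Q40 and '5' encodes Q20 under Phred+33
profile = mean_quality_per_position(["II5", "I5"])
print(profile)  # [40.0, 30.0, 20.0]
```

A falling profile toward the 3' end is the usual signal that end trimming may help.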


3. Data pre-processing

Data pre-processing removes over-represented sequences and low-quality reads, which may otherwise interfere with alignment and ultimately distort expression estimates.

Depending on the quality of the data:

  • Remove adapters and over-represented sequences from the small RNA-Seq data with cutadapt, supplying the adapter sequences used during sequencing.
  • Trim low-quality bases and read ends to improve the overall quality of each read; Trimmomatic can be used for this step.
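The two pre-processing steps can be sketched in a deliberately simplified form. This is not the algorithm used by cutadapt or Trimmomatic: it matches the adapter exactly (no mismatches or partial adapters) and trims only the low-quality tail. The adapter string, thresholds, and function names are illustrative assumptions.

```python
def trim_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter."""
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]

def trim_low_quality_tail(seq, qual, min_q=20):
    """Drop trailing bases whose Phred+33 score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - 33 < min_q:
        end -= 1
    return seq[:end], qual[:end]

def preprocess(seq, qual, adapter, min_q=20, min_len=15):
    """Adapter-trim, then quality-trim; return None if the read gets too short."""
    seq, qual = trim_adapter(seq, qual, adapter)
    seq, qual = trim_low_quality_tail(seq, qual, min_q)
    return (seq, qual) if len(seq) >= min_len else None

ADAPTER = "TGGAATTCTC"  # hypothetical adapter; use the one from your library prep
read_seq = "TGAGGTAGTAGGTTGTATAGTT" + ADAPTER       # 22 nt insert plus adapter
read_qual = "I" * 20 + "##" + "I" * 10              # two low-quality insert bases
cleaned = preprocess(read_seq, read_qual, ADAPTER)
print(cleaned)
```

Adapter removal comes first here because small RNA inserts are usually shorter than the read length, so most reads run into the adapter.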


4. Alignment

A sequence alignment arranges DNA, RNA, or protein sequences to identify regions of similarity that may reflect functional, structural, or evolutionary relationships between them. In reference-based small RNA-Seq, reads are aligned to a reference genome using Bowtie.
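To show what "aligning a read to a reference" means at its simplest, the sketch below reports every exact occurrence of a read on either strand of a reference string. This naive scan is a stand-in for illustration only; Bowtie relies on an indexed search and tolerates mismatches, neither of which is modelled here. All names and sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq))

def exact_hits(read, reference):
    """Return (0-based leftmost position, strand) for each exact match.

    Minus-strand hits are reported at the forward-strand coordinate where
    the read's reverse complement occurs.
    """
    hits = []
    for strand, query in (("+", read), ("-", reverse_complement(read))):
        start = reference.find(query)
        while start != -1:
            hits.append((start, strand))
            start = reference.find(query, start + 1)
    return hits

reference = "AAACGTTTGGG"  # toy reference
print(exact_hits("CGTT", reference))  # [(3, '+'), (1, '-')]
print(exact_hits("CCCA", reference))  # [(7, '-')]
```

A read that maps to several positions, as many small RNAs do, shows up here as multiple hits; how such multi-mappers are handled is a policy choice in a real pipeline.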


5. Gene expression analysis

Small RNA expression analysis provides a snapshot of the non-coding RNAs actively expressed under various conditions. Read counts are measured using HTSeq or featureCounts. DESeq and edgeR are two packages used to test for differential expression.
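As a minimal sketch of how raw read counts become comparable expression values, the example below scales counts to counts per million (CPM) and computes a log2 fold change between two samples. This is far simpler than the statistical models in DESeq or edgeR (no replicates, dispersion estimates, or p-values); the feature names, counts, and pseudocount are made up for illustration.

```python
import math

def cpm(counts):
    """Scale raw counts to counts per million mapped reads.

    Assumes the library contains at least one counted read.
    """
    total = sum(counts.values())
    return {name: 1e6 * c / total for name, c in counts.items()}

def log2_fold_change(cpm_a, cpm_b, pseudocount=1.0):
    """log2(B / A) per feature; the pseudocount avoids division by zero."""
    return {
        name: math.log2((cpm_b[name] + pseudocount) / (cpm_a[name] + pseudocount))
        for name in cpm_a
    }

# Hypothetical read counts for two small RNAs in two samples
control = {"miR-A": 100, "miR-B": 900}
treated = {"miR-A": 400, "miR-B": 600}

lfc = log2_fold_change(cpm(control), cpm(treated))
for name, value in lfc.items():
    print(f"{name}\t{value:.2f}")
```

Here miR-A rises about fourfold (log2 fold change close to 2) while miR-B falls, even though both libraries contain the same total number of reads.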


6. Target prediction

This step examines the relationship between small RNAs and messenger RNAs (mRNAs). TargetScan and PicTar provide microRNA target predictions based on sequence complementarity to target sites, with emphasis on perfect base-pairing in the seed region and on sequence conservation.
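The seed-pairing idea can be illustrated with a short sketch that derives the mRNA motif complementary to miRNA positions 2 to 8 and scans a 3' UTR for it. It covers perfect seed complementarity only and ignores conservation and site context, so it is not a substitute for TargetScan or PicTar. The UTR sequence and function names are hypothetical.

```python
def seed_site(mirna, start=2, end=8):
    """Return the mRNA motif complementary to miRNA positions start..end (1-based)."""
    seed = mirna.upper().replace("T", "U")[start - 1:end]
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    # The miRNA pairs antiparallel to the mRNA, hence the reversal
    return "".join(comp[b] for b in reversed(seed))

def find_seed_matches(mirna, utr):
    """Return 0-based start positions of perfect seed matches in the UTR."""
    site = seed_site(mirna)
    utr = utr.upper().replace("T", "U")
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)
    return hits

let7a = "UGAGGUAGUAGGUUGUAUAGUU"   # let-7a mature sequence
toy_utr = "AAACUACCUCAGGG"         # made-up 3' UTR fragment

print(seed_site(let7a))                   # CUACCUC
print(find_seed_matches(let7a, toy_utr))  # [3]
```

Because a seven-nucleotide motif occurs by chance fairly often, a seed match alone is weak evidence, which is why prediction tools add filters such as conservation.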


7. Functional analysis

Differentially expressed small RNAs are annotated with Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways.
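One common way to summarise GO or KEGG annotations is an over-representation test, which asks whether a term appears among the genes of interest more often than chance would predict. The text above does not state which method is used, so the hypergeometric test below is an assumption for illustration, and the numbers are invented.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """P(X >= overlap) for a hypergeometric draw without replacement.

    population: all genes considered
    annotated:  genes in the population carrying the term
    selected:   genes of interest (e.g. predicted targets)
    overlap:    selected genes that carry the term
    """
    upper = min(selected, annotated)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(population, selected)

# Toy numbers: 10 genes, 5 carry the term, 2 selected, both carry the term
p_value = hypergeom_enrichment_p(population=10, annotated=5, selected=2, overlap=2)
print(f"{p_value:.4f}")  # 0.2222
```

When many terms are tested at once, the resulting p-values would normally be corrected for multiple testing before interpretation.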


8. Deliverables

A brief report covering raw data statistics, the alignment summary, and key findings is provided as the final result, along with tab-delimited files of identified small RNAs, differentially expressed small RNAs with their GO and KEGG annotations, and predicted small RNA targets.

Upon request, we can also share intermediate files, other information, and any additional analysis required.