2. Expression Matrix

Pipeline

Data Structure

Inputs

| File format | Information contained in file | File description | Notes |
| --- | --- | --- | --- |
| bam | alignments | Produced by mapping reads to the transcriptome. | Reads are trimmed using a proprietary version of Cutadapt. We map to the transcriptome for better sensitivity (see details in the protocol and example). |

Outputs

| File format | Information contained in file | File description | Notes |
| --- | --- | --- | --- |
| bigWig | signal | Normalized RNA-seq signal | Signals are generated for both the plus and minus strands of the transcriptome, and for both unique reads and unique+multimapping reads. |
| tsv | gene (ncRNA) quantifications | Non-normalized counts. | |

Running Scripts

Software/Tools

  • RSEM

  • HOMER

  • HTSeq

  • featureCounts

Example of single case
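
A minimal sketch of a single-sample run, assuming featureCounts is installed and on the PATH. The helper `featurecounts_cmd` and the file names (`sample1.bam`, `annotation.gtf`, `sample1.featurecounts.txt`) are hypothetical placeholders, and the strandedness value (`-s 1`) is an assumption that should be checked against your library protocol.

```python
import shlex

def featurecounts_cmd(bam, gtf, out, strand=1, threads=4):
    """Build a featureCounts command line for one BAM file (hypothetical helper)."""
    return [
        "featureCounts",
        "-a", gtf,           # annotation file (GTF)
        "-o", out,           # output count table
        "-t", "exon",        # feature type to count
        "-g", "gene_id",     # attribute used to group features into genes
        "-s", str(strand),   # 0 = unstranded, 1 = stranded, 2 = reversely stranded
        "-T", str(threads),  # number of threads
        bam,
    ]

# Placeholder file names; replace with your own paths
cmd = featurecounts_cmd("sample1.bam", "annotation.gtf", "sample1.featurecounts.txt")
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list keeps file names with spaces safe and makes it easy to pass the same list to `subprocess.run` once the paths are real.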

Example of batch job
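
A batch run can be sketched as a loop that prepares one counting job per sample. The example below uses htseq-count and assumes a layout of `bam/<sample>.bam` inputs and `counts/<sample>.htseq.txt` outputs; the directory names, sample IDs, and the helper `htseq_jobs` are hypothetical, and the strandedness setting is an assumption to verify for your data.

```python
from pathlib import PurePosixPath

def htseq_jobs(sample_ids, bam_dir, gtf, out_dir, stranded="yes"):
    """Return one (command, output_path) pair per sample (hypothetical helper)."""
    jobs = []
    for sid in sample_ids:
        bam = str(PurePosixPath(bam_dir) / f"{sid}.bam")
        out = str(PurePosixPath(out_dir) / f"{sid}.htseq.txt")
        cmd = [
            "htseq-count",
            "-f", "bam",      # input format
            "-s", stranded,   # yes | no | reverse
            "-t", "exon",     # feature type to count
            "-i", "gene_id",  # GTF attribute used as the gene identifier
            bam, gtf,
        ]
        jobs.append((cmd, out))
    return jobs

# Placeholder sample IDs and paths
jobs = htseq_jobs(["sample1", "sample2", "sample3"], "bam", "annotation.gtf", "counts")
for cmd, out in jobs:
    # htseq-count writes counts to stdout, so each job redirects to its own file
    print(" ".join(cmd), ">", out)
    # To run: with open(out, "w") as fh: subprocess.run(cmd, stdout=fh, check=True)
```

Keeping the job list separate from execution makes it straightforward to submit the same commands to a cluster scheduler instead of running them locally.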

Tips/Utilities

Merge multiple BAM files into one
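
One common way to do this is `samtools merge`; the sketch below assumes samtools is installed and that the inputs are sorted in the same order. The helper `samtools_merge_cmd` and the file names are hypothetical placeholders.

```python
def samtools_merge_cmd(out_bam, in_bams, threads=4):
    """Build a `samtools merge` command line (hypothetical helper)."""
    if len(in_bams) < 2:
        raise ValueError("need at least two input BAM files to merge")
    # -@ sets the number of additional compression threads
    return ["samtools", "merge", "-@", str(threads), out_bam, *in_bams]

# Placeholder file names; replace with your own replicate BAMs
merge_cmd = samtools_merge_cmd("merged.bam", ["rep1.bam", "rep2.bam"])
print(" ".join(merge_cmd))
# To run: import subprocess; subprocess.run(merge_cmd, check=True)
```

By default `samtools merge` will not overwrite an existing output file, and the merged file usually needs indexing with `samtools index` before it can be viewed in a genome browser.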

Homework and more

  1. Visualize your mapped reads with IGV (locally) and/or the UCSC Genome Browser (online).

  2. Learn how to construct the expression matrix using HTSeq, featureCounts and HOMER; then compare the results of the three methods.
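
As a starting point for item 2, per-sample count tables can be combined into a gene-by-sample matrix. The sketch below assumes htseq-count style input: two tab-separated columns (gene ID, count) followed by summary rows whose names start with `__`. The toy data and the helper names `read_htseq_counts` and `build_matrix` are made up for illustration.

```python
import io

def read_htseq_counts(handle):
    """Parse two-column htseq-count output, skipping the '__' summary rows."""
    counts = {}
    for line in handle:
        if not line.strip():
            continue  # ignore blank lines
        gene, value = line.rstrip("\n").split("\t")
        if gene.startswith("__"):
            continue  # e.g. __no_feature, __ambiguous
        counts[gene] = int(value)
    return counts

def build_matrix(per_sample):
    """Combine {sample: {gene: count}} into a header row plus one row per gene."""
    samples = sorted(per_sample)
    genes = sorted(set().union(*(c.keys() for c in per_sample.values())))
    rows = [["gene_id", *samples]]
    for gene in genes:
        # Genes absent from a sample are filled with 0
        rows.append([gene, *(per_sample[s].get(gene, 0) for s in samples)])
    return rows

# Made-up toy data standing in for two per-sample count files
toy = {
    "sample1": io.StringIO("geneA\t10\ngeneB\t0\n__no_feature\t5\n"),
    "sample2": io.StringIO("geneA\t7\ngeneC\t3\n__no_feature\t2\n"),
}
matrix = build_matrix({s: read_htseq_counts(h) for s, h in toy.items()})
for row in matrix:
    print("\t".join(map(str, row)))
```

Matrices built this way from each tool's output share the same layout, which makes a gene-by-gene comparison between methods easier.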

More Reading and Practice

  • Additional Tutorial: 2. Construction of expression matrix

  • Bioinformatics Data Skills

    • 11) Working with Alignment Data

Video

a) Expression matrix

@Youtube

@Bilibili
