2018 - Bioinformatics Tutorial - Advanced (2018)
5.Control Data


Last updated 5 years ago


Major Control Data

1. ENCODE (cell lines)

ENCODE aims to build a comprehensive list of functional elements in the human genome.

Tier 1:

  • GM12878 is a lymphoblastoid cell line produced from the blood of a female donor with northern and western European ancestry by EBV transformation. It was one of the original HapMap cell lines and has been selected by the International HapMap Project for deep sequencing using the Solexa/Illumina platform. This cell line has a relatively normal karyotype and grows well. Choice of this cell line offers potential synergy with the International HapMap Project and genetic variation studies. It represents the mesoderm cell lineage. Cells will be obtained from the [coriell.org] (Catalog ID GM12878).

  • K562 is an immortalized cell line produced from a female patient with chronic myelogenous leukemia (CML). It is a widely used model for cell biology, biochemistry, and erythropoiesis. It grows well, is transfectable, and represents the mesoderm lineage. Cells will be obtained from the [atcc.org] (ATCC Number CCL-243).

  • H1 human embryonic stem cells will be obtained from [cellulardynamics.com].

Tier 2:

  • HeLa-S3

  • HepG2

  • HUVEC

Tier 2.5:

  • SK-N-SH

  • IMR90 (ATCC CCL-186)

  • A549 (ATCC CCL-185)

  • MCF7 (ATCC HTB-22)

  • HMEC or LHCM

  • CD14+

  • CD20+

  • Primary heart or liver cells

  • Differentiated H1 cells
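To look up experiments for the cell lines listed above, one option is the ENCODE portal's search endpoint. The snippet below is a minimal sketch that only builds the query URL; it does not contact the server. The query parameters (`type`, `biosample_ontology.term_name`, `limit`, `format`) are assumptions based on the portal's search interface and should be checked against the current ENCODE REST API documentation. The helper name `encode_experiment_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base search endpoint of the ENCODE portal.
ENCODE_SEARCH = "https://www.encodeproject.org/search/"

# Tier 1 cell lines named in this section.
TIER1 = ["GM12878", "K562", "H1"]


def encode_experiment_url(cell_line: str, limit: int = 5) -> str:
    """Build a search URL for experiments on one cell line.

    The parameter names are assumptions, not confirmed by this tutorial.
    """
    params = {
        "type": "Experiment",
        "biosample_ontology.term_name": cell_line,
        "limit": limit,
        "format": "json",
    }
    return ENCODE_SEARCH + "?" + urlencode(params)


for line in TIER1:
    print(encode_experiment_url(line))
```

The resulting URLs can be opened in a browser or fetched with any HTTP client to inspect the JSON response.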

2. CCLE (cancer cell lines)

3. TCGA (tissue)

The Pan-Cancer Atlas

From the analysis of over 11,000 tumors from 33 of the most prevalent forms of cancer, the Pan-Cancer Atlas provides a uniquely comprehensive, in-depth, and interconnected understanding of how, where, and why tumors arise in humans. As a singular and unified point of reference, the Pan-Cancer Atlas is an essential resource for the development of new treatments in the pursuit of precision medicine.

4. 1000 Genomes

The goal of the 1000 Genomes Project was to find most genetic variants with frequencies of at least 1% in the populations studied.
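The 1% frequency threshold can be illustrated with a small calculation. The sketch below computes the alternate allele frequency from diploid genotypes and keeps variants at or above 1%. The variant names and genotype data are made up for illustration, and using the alternate allele (rather than the minor allele) is a simplifying assumption.

```python
from typing import Dict, List, Tuple

Genotype = Tuple[int, int]  # two alleles per sample: 0 = reference, 1 = alternate


def alt_allele_frequency(genotypes: List[Genotype]) -> float:
    """Fraction of alternate alleles among all alleles in the samples."""
    total_alleles = 2 * len(genotypes)
    if total_alleles == 0:
        raise ValueError("no genotypes given")
    alt_alleles = sum(a + b for a, b in genotypes)
    return alt_alleles / total_alleles


def common_variants(calls: Dict[str, List[Genotype]], min_af: float = 0.01) -> List[str]:
    """Return the IDs of variants whose alternate allele frequency is >= min_af."""
    return [vid for vid, gts in calls.items() if alt_allele_frequency(gts) >= min_af]


# Toy data for 100 samples (200 alleles) per variant; entirely hypothetical.
calls = {
    "var_common": [(0, 1)] * 5 + [(0, 0)] * 95,     # 5 / 200 = 2.5%
    "var_rare": [(0, 1)] * 1 + [(0, 0)] * 99,       # 1 / 200 = 0.5%
    "var_threshold": [(0, 1)] * 2 + [(0, 0)] * 98,  # 2 / 200 = 1.0%
}

print(common_variants(calls))
```

With this toy data, `var_rare` falls below the threshold and is dropped, while the other two variants are kept.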

Online Databases

Major Data Central

1. UCSC: genome browser for vertebrates.

2. Ensembl: genome annotation.

3. NCBI: contributes to the NIH mission of ‘uncovering new knowledge’.

Expression Data

4. GTEx: gene expression in different tissues.

5. Expression Atlas: exploring gene expression results across species under different biological conditions.

6. cBioPortal: visualization, analysis and download of large-scale cancer genomics data sets.

7. TCGA GEPIA: gene expression in different TCGA tumor types.

8. TCGA ncRNA

Video

a) Imputation and confounders

Figure 1. Cellular Localization of different types of RNAs, Ref.:

ENCODE: https://www.encodeproject.org/
Cell types of ENCODE
Coriell Institute for Medical Research
American Type Culture Collection (ATCC)
Cellular Dynamics International
ENCODE integrated encyclopedia paper (Nature): https://www.nature.com/articles/nature11233
TCGA data (GDC Data Portal): https://portal.gdc.cancer.gov/
Pan-Cancer Atlas: http://www.cell.com/pb-assets/consortium/pancanceratlas/pancani3/index.html
1000 Genomes: http://www.internationalgenome.org/
UCSC: https://genome.ucsc.edu/
Ensembl: http://www.ensembl.org/index.html
NCBI: https://www.ncbi.nlm.nih.gov/
GTEx: https://www.gtexportal.org/home/
Expression Atlas: https://www.ebi.ac.uk/gxa/home
cBioPortal: http://www.cbioportal.org/index.do
GEPIA: http://gepia.cancer-pku.cn/index.html
TCGA ncRNA (TANRIC): http://ibl.mdanderson.org/tanric/_design/basic/index.html
@Youtube
@Bilibili