Note-3 Mapping

These notes are mainly from the Galaxy Training tutorial: Mapping

Workflow

FASTQ (raw sequencing reads)
→ Bowtie 2 (mapping/alignment)
→ SAM/BAM file (the output of the Bowtie 2 tool)
→ IGV visualization (using IGV to read the SAM/BAM file)
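
The workflow above runs inside Galaxy, but the same steps can be sketched as command lines. The snippet below is a minimal sketch, assuming single-end reads and that `bowtie2`, `bowtie2-build` and `samtools` are installed locally. The file names and the `_index` naming scheme are hypothetical. It only builds and prints the commands; it does not run them.

```python
import shlex


def build_mapping_commands(reference_fasta, reads_fastq, prefix):
    """Return the command lines for a single-end FASTQ -> sorted BAM run."""
    index_prefix = f"{prefix}_index"   # hypothetical naming scheme
    sam_path = f"{prefix}.sam"
    bam_path = f"{prefix}.sorted.bam"
    return [
        # 1. Index the reference genome so Bowtie 2 can search it.
        ["bowtie2-build", reference_fasta, index_prefix],
        # 2. Map the reads (-U = unpaired reads, -S = SAM output file).
        ["bowtie2", "-x", index_prefix, "-U", reads_fastq, "-S", sam_path],
        # 3. Convert the SAM text file to a coordinate-sorted BAM file.
        ["samtools", "sort", "-o", bam_path, sam_path],
        # 4. Index the BAM file (IGV needs this index to load it).
        ["samtools", "index", bam_path],
    ]


commands = build_mapping_commands("reference.fa", "reads.fastq", "sample")
for cmd in commands:
    print(shlex.join(cmd))
```

To actually run the steps, each list could be passed to `subprocess.run(cmd, check=True)`. Paired-end data would need `-1` and `-2` instead of `-U`.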

What is mapping

  • Sequencing produces a collection of sequences without genomic context.

    • What is sequence data? FASTQ/FASTA files
  • We do not know which part of the genome the sequences correspond to.

  • Mapping the reads of an experiment to a reference genome is a key step in modern genomic data analysis.

  • Mapping assigns each read to a specific location in the genome, so insights such as the expression level of genes can be gained.

  • BLAST analysis can be used to figure out where the sequenced pieces fit best in a known genome.

    • What is [[BLAST]] analysis, and how does it work?
      • BLAST = Basic Local Alignment Search Tool
        • It performs local alignment.
  • What is a reference genome?

    • A reference genome is a standardized representative DNA sequence used as a baseline for studying a species.
    • It is like a map/template of an organism’s genome that researchers compare new sequencing data against.

What tools to use
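
Before looking at the real tools, the idea of mapping described earlier (finding where a read fits best in a reference genome) can be illustrated with a toy example. This is only a sketch: it slides the read along a made-up reference on both strands and counts mismatches. Real mappers use an index instead of this brute-force scan and also handle gaps, which this sketch does not. All names and sequences here are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence as it reads on the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]


def count_mismatches(a, b):
    """Count positions where two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, max_mismatches=1):
    """Return (position, strand, mismatches) of the best placement, or None.

    The position is 0-based here (SAM files use 1-based positions).
    """
    best = None
    for strand, query in (("+", read), ("-", reverse_complement(read))):
        for pos in range(len(reference) - len(query) + 1):
            mm = count_mismatches(query, reference[pos:pos + len(query)])
            if mm <= max_mismatches and (best is None or mm < best[2]):
                best = (pos, strand, mm)
    return best


reference = "ACGTTGCAAGGCTTACCGATGA"   # made-up toy reference

print(map_read("AGGCTT", reference))   # → (8, '+', 0)  exact match, forward strand
print(map_read("ATCGGT", reference))   # → (14, '-', 0) matches the reverse strand
print(map_read("AGGATT", reference))   # → (8, '+', 1)  one mismatch tolerated
print(map_read("TTTTTT", reference))   # → None         no acceptable placement
```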

Bowtie2

  • Bowtie 2 is a sequence aligner / mapper.

  • The tool outputs a BAM file.

  • Its job is to take sequencing reads (usually FASTQ) and determine:

    • where each read came from in the reference genome
    • best matching genomic location
    • orientation of read
    • the mismatches / gaps in the alignment (within the allowed limits)
  • Used commonly for:

    • DNA-seq
    • genomic resequencing
    • ChIP-seq
    • ATAC-seq
    • some RNA workflows (though splice-aware aligners are preferred for RNA-seq)
  • Primary output:

    • SAM file (text format alignment)
    • often converted to a BAM file (binary, compressed version)

IGV

  • IGV = Integrative Genomics Viewer

  • Genome browser / visualization software.

  • Used to visually inspect alignment data.

How to read IGV

  • IGV | Sequencing Data Basics
    • A really good official tutorial on the basic IGV visualization signs and their meaning.
    • Such as:
      • Gray and white reads represent different mapping confidence.
      • Icons that mark insertions/deletions in IGV
      • Nucleotide highlights showing differences compared with the reference sequence.
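
Most of the signs above correspond to fields stored in each SAM/BAM record: mapping confidence is the MAPQ column, orientation is a bit in the FLAG column, and insertions/deletions are encoded in the CIGAR string. The sketch below pulls those fields out of one SAM line with plain Python. The record itself is made up for illustration, and the function name is hypothetical.

```python
import re

# One CIGAR operation is a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def summarize_sam_line(line):
    """Extract the fields behind the IGV signs from one SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(fields[5])]
    return {
        "read": fields[0],
        "chrom": fields[2],
        "pos": int(fields[3]),                    # 1-based leftmost position
        "mapq": int(fields[4]),                   # mapping quality (confidence)
        "strand": "-" if flag & 16 else "+",      # FLAG bit 0x10 = reverse strand
        "insertions": sum(n for n, op in ops if op == "I"),
        "deletions": sum(n for n, op in ops if op == "D"),
    }


# Made-up record: QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL
example = "\t".join(
    ["read1", "16", "chr1", "100", "42", "20M2I10M1D18M", "*", "0", "0",
     "A" * 50, "I" * 50]
)

summary = summarize_sam_line(example)
print(summary)
```

For this record the summary reports a reverse-strand read with MAPQ 42, 2 inserted bases and 1 deleted base. The nucleotide highlights are different: IGV derives them by comparing the read bases with the reference sequence, which this sketch does not attempt.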

How to install IGV (macOS, with Homebrew)

brew install --cask igv-desktop

Citation