Note-3 Mapping
These contents are mainly notes from the Galaxy Training tutorial: Mapping
Workflow
FASTQ (raw sequencing reads)
→ Bowtie 2 (mapping/alignment)
→ SAM/BAM file (the output of the Bowtie 2 tool)
→ IGV visualization (using IGV to read the SAM/BAM file)
What is mapping
Sequencing produces a collection of sequences without genomic context.
- What is sequence data? FASTQ/FASTA files
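To make the FASTQ format concrete, here is a minimal sketch of reading it in Python. It assumes the simple four-line layout (header starting with `@`, bases, a `+` separator, quality string) and Phred+33 quality encoding; the names `FastqRecord`, `parse_fastq`, `phred_scores` and the example reads are made up for illustration and are not part of the tutorial.

```python
from typing import Iterator, List, NamedTuple


class FastqRecord(NamedTuple):
    name: str       # read identifier (header line without the leading "@")
    sequence: str   # base calls
    quality: str    # one ASCII-encoded quality character per base


def parse_fastq(text: str) -> Iterator[FastqRecord]:
    """Yield records from FASTQ text, assuming exactly four lines per record."""
    lines = [line.strip() for line in text.strip().splitlines()]
    for i in range(0, len(lines), 4):
        # Unpacking fails with ValueError if the last record is truncated.
        header, sequence, plus, quality = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record at line {i + 1}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        yield FastqRecord(header[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode quality characters into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


# Two hypothetical reads, only for illustration.
EXAMPLE = """\
@read1
GATTACA
+
IIIIIII
@read2
ACGT
+
!!5I
"""

if __name__ == "__main__":
    for record in parse_fastq(EXAMPLE):
        print(record.name, record.sequence, phred_scores(record.quality))
```

Note that a record carries only a name, bases, and qualities. There is no genomic position in it, which is why mapping is needed.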
We do not know which part of the genome the sequences correspond to.
Mapping the reads of an experiment to a reference genome is a key step in modern genomic data analysis.
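With mapping, the reads are assigned to a specific location in the genome, and insights such as the expression level of genes can be gained.
In principle, a BLAST analysis could be used to figure out where the sequenced pieces fit best in the known genome, but it is too slow for millions of short reads, which is why dedicated mappers such as Bowtie 2 are used instead.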
- What is [[BLAST]] analysis, and how does it work?
  - BLAST = Basic Local Alignment Search Tool
    - It performs local alignment
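To see what "local alignment" means, here is a minimal sketch of the classic Smith-Waterman algorithm, which finds the best-scoring matching region between two sequences. This is not how BLAST is implemented (BLAST uses a faster seed-and-extend heuristic); it only illustrates the idea. The scoring values and the function name `smith_waterman` are assumptions chosen for the example.

```python
from typing import Tuple


def smith_waterman(query: str, target: str,
                   match: int = 2, mismatch: int = -1,
                   gap: int = -2) -> Tuple[int, str, str]:
    """Return (best_score, aligned_query, aligned_target) for a local alignment."""
    rows, cols = len(query) + 1, len(target) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; a cell never drops below 0, which makes the alignment local.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,  # match or mismatch
                              score[i - 1][j] + gap,      # gap in target
                              score[i][j - 1] + gap)      # gap in query
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score reaches 0.
    i, j = best_pos
    aligned_q, aligned_t = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if query[i - 1] == target[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            aligned_q.append(query[i - 1])
            aligned_t.append(target[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            aligned_q.append(query[i - 1])
            aligned_t.append("-")
            i -= 1
        else:
            aligned_q.append("-")
            aligned_t.append(target[j - 1])
            j -= 1
    return best, "".join(reversed(aligned_q)), "".join(reversed(aligned_t))


if __name__ == "__main__":
    print(smith_waterman("ACGT", "TTACGTTT"))
```

The matrix has one cell per pair of positions, so the cost grows with the product of the two sequence lengths. That is fine for a toy example but far too slow for millions of reads against a whole genome.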
What is a reference genome
- A reference genome is a standardized representative DNA sequence used as a baseline for studying a species.
- It is like a map/template of an organism’s genome that researchers compare new sequencing data against.

## What tools to use

### Bowtie 2
Bowtie 2 is a sequence aligner / mapper.
It is the tool that produces the alignment file (SAM, usually stored as BAM).
Its job is to take sequencing reads (usually FASTQ) and determine:
- where each read came from in the reference genome
  - best matching genomic location
  - orientation of the read
  - mismatches / gaps allowed
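The three things a mapper reports (location, orientation, mismatches) can be sketched with a brute-force search. This is only a toy illustration: Bowtie 2 itself uses an FM index to avoid scanning the whole reference, and it also handles gaps, which this sketch does not. The names `Hit`, `map_read`, `max_mismatches` and the example reference are hypothetical.

```python
from typing import NamedTuple, Optional

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the sequence as it would read on the opposite strand."""
    return seq.translate(_COMPLEMENT)[::-1]


class Hit(NamedTuple):
    position: int    # 0-based offset of the leftmost aligned base
    strand: str      # "+" (read as given) or "-" (reverse complement)
    mismatches: int  # number of differing bases (no gaps in this sketch)


def hamming(a: str, b: str) -> int:
    """Count positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read: str, reference: str,
             max_mismatches: int = 2) -> Optional[Hit]:
    """Try every position on both strands and keep the hit with fewest mismatches."""
    best: Optional[Hit] = None
    for strand, seq in (("+", read), ("-", reverse_complement(read))):
        for pos in range(len(reference) - len(seq) + 1):
            mm = hamming(seq, reference[pos:pos + len(seq)])
            if mm <= max_mismatches and (best is None or mm < best.mismatches):
                best = Hit(pos, strand, mm)
    return best


# Hypothetical reference, only for illustration.
REFERENCE = "TTGACCGATTACAGGCTA"

if __name__ == "__main__":
    print(map_read("GATTACA", REFERENCE))   # forward-strand read
    print(map_read("TGTAATC", REFERENCE))   # same locus, read from the other strand
    print(map_read("GATTGCA", REFERENCE))   # one mismatch
```

When several positions tie, this sketch keeps the first one it finds. Real mappers express that kind of ambiguity through the mapping quality score.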
Commonly used for:
- DNA-seq
  - genomic resequencing
  - ChIP-seq
  - ATAC-seq
  - some RNA workflows (though splice-aware aligners are preferred for RNA-seq)
Primary output:
- SAM file (text-format alignment)
  - often converted to a BAM file (binary, compressed version)

### IGV
IGV = Integrative Genomics Viewer
Genome browser / visualization software.
Used to visually inspect alignment data.
How to read IGV
- IGV | Sequencing Data Basics
  - A really good official tutorial on the basic IGV visual cues and their meaning.
  - Such as:
    - Gray and white reads represent different mapping confidence.
    - Insertion/deletion markers in IGV
    - Nucleotide highlights showing differences compared with the reference sequence.
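The visual cues listed above come from fields stored in each alignment record. As a rough sketch, the snippet below pulls those fields out of one SAM line: MAPQ (mapping confidence; as far as I understand, IGV draws reads with MAPQ 0 as white by default), FLAG bit 0x10 (reverse strand), and the CIGAR string (where `I` and `D` are insertions and deletions). The example line and the names `Alignment`, `parse_sam_line`, `cigar_operations` are made up for illustration. Header lines starting with `@` are not handled.

```python
import re
from typing import List, NamedTuple, Tuple


class Alignment(NamedTuple):
    read_name: str
    flag: int            # bitwise flags describing the alignment
    reference_name: str
    position: int        # 1-based leftmost mapping position
    mapq: int            # mapping quality (confidence in the placement)
    cigar: str           # compact description of matches, insertions, deletions


def parse_sam_line(line: str) -> Alignment:
    """Parse the first six of the eleven mandatory tab-separated SAM fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    return Alignment(fields[0], int(fields[1]), fields[2],
                     int(fields[3]), int(fields[4]), fields[5])


def is_reverse_strand(aln: Alignment) -> bool:
    """FLAG bit 0x10 is set when the read maps to the reverse strand."""
    return bool(aln.flag & 0x10)


def is_unmapped(aln: Alignment) -> bool:
    """FLAG bit 0x4 is set when the read could not be mapped."""
    return bool(aln.flag & 0x4)


def cigar_operations(cigar: str) -> List[Tuple[int, str]]:
    """Split a CIGAR string such as '5M1I4M' into (length, operation) pairs."""
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]


# Hypothetical record, only for illustration.
EXAMPLE_LINE = "read1\t16\tchr1\t101\t42\t5M1I4M2D3M\t*\t0\t0\tACGTACGTACGTA\t*"

if __name__ == "__main__":
    aln = parse_sam_line(EXAMPLE_LINE)
    print(aln.reference_name, aln.position, aln.mapq, is_reverse_strand(aln))
    print(cigar_operations(aln.cigar))
```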
How to download IGV
- IGV official website
- On macOS, using Homebrew:
  `brew install --cask igv-desktop`

Citation
- Joachim Wolff, Bérénice Batut, Helena Rasche, Mapping (Galaxy Training Materials). https://training.galaxyproject.org/training-material/topics/sequence-analysis/tutorials/mapping/tutorial.html Online; accessed Sat Apr 25 2026
- Hiltemann, Saskia, Rasche, Helena et al., 2023 Galaxy Training: A Powerful Framework for Teaching! PLOS Computational Biology 10.1371/journal.pcbi.1010752
- Batut et al., 2018 Community-Driven Data Analysis Training for Biology Cell Systems 10.1016/j.cels.2018.05.012