Integrative Genomics Viewer

Info about link between tool and IGV

http://software.broadinstitute.org/software/igv/

The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.

IGV from web-tool

There are two ways to launch IGV from the webtool:

Complete region

Clicking the main Launch IGV button (1) will launch IGV, visualizing the specified region (2).

Variation

Clicking the local Launch IGV button (1) will launch IGV, visualizing the variation (2).

Before IGV launches, a pre-launch-dialog will be shown. This dialog allows the user to select whether to launch a new IGV-session (1) or jump to the specified location in an existing IGV-session.

The pre-launch-dialog also enables specifying which tracks IGV should load by default.
Clicking Load Track Options (3) will display all available tracks.
Each track can be enabled/disabled by checking/un-checking the checkbox of the corresponding track.
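For users who prefer to steer an already running IGV session by hand, the sketch below builds the kind of request that IGV accepts on its local command port. It assumes that IGV is running with its port enabled on the default 60151. The helper names and the example track URL are hypothetical, and this is not necessarily how the web tool itself talks to IGV.

```python
from urllib.parse import urlencode

IGV_PORT = 60151  # IGV's default command port; assumed to be enabled in IGV


def igv_goto_url(locus, host="localhost", port=IGV_PORT):
    """Build a URL asking a running IGV session to jump to `locus`."""
    return f"http://{host}:{port}/goto?{urlencode({'locus': locus})}"


def igv_load_url(track_url, locus=None, genome=None,
                 host="localhost", port=IGV_PORT):
    """Build a URL asking a running IGV session to load one track."""
    params = {"file": track_url}
    if locus:
        params["locus"] = locus
    if genome:
        params["genome"] = genome
    return f"http://{host}:{port}/load?{urlencode(params)}"


# Hypothetical example values; opening the printed URLs in a web browser
# sends the request to IGV.
print(igv_goto_url("chr1:1000-2000"))
print(igv_load_url("http://example.org/track.bed", genome="hg18"))
```

The functions only build the URLs, so they can be tried without IGV running.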

The IGV launch can be aborted by clicking Cancel (4).

IGV depends on Java. To install Java, go to the Java download page by clicking JAVA (5).

IGV from PC

Download ( http://www.broadinstitute.org/software/igv/download ) and open IGV
Change the data server:
Click View > Preferences > Advanced and select Edit server properties

In Data Registry URL, paste in http://bioinformatics.psb.ugent.be/downloads/genomeview/hek293/data.txt
Set the default genome to ‘Human hg18’ (upper left field in the browser window)
Load the data of interest by clicking File > Load from Server and selecting your dataset of choice.
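Once the data server is set up, the remaining steps (choosing the genome, loading tracks, jumping to a region) can also be scripted. The sketch below writes an IGV batch script, which can be run from Tools > Run Batch Script. It assumes that the tracks are reachable by URL; the function name and the example URL are placeholders, not part of the HEK293 data server.

```python
def make_batch_script(genome, track_urls, locus=None, display_mode=None):
    """Return the text of an IGV batch script.

    display_mode, if given, must be one of IGV's batch commands
    'collapse', 'expand' or 'squish'.
    """
    lines = ["new", f"genome {genome}"]  # start clean, then pick the genome
    lines.extend(f"load {url}" for url in track_urls)
    if locus:
        lines.append(f"goto {locus}")
    if display_mode:
        if display_mode not in {"collapse", "expand", "squish"}:
            raise ValueError(f"unknown display mode: {display_mode}")
        lines.append(display_mode)
    return "\n".join(lines) + "\n"


# Placeholder track URL, for illustration only.
script = make_batch_script(
    "hg18",
    ["http://example.org/hek293/example_track.bed"],
    locus="chr1:1000-2000",
    display_mode="collapse",
)
print(script)
```

Saving the printed text to a file and running it in IGV should reproduce the manual steps for that one track.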

IGV 293 data tracks

We have organized the data in a series of structured tracks that can be loaded individually or in combination, allowing the user to inspect the dataset of interest for a particular gene or gene region. A short description of the contents of each track or set of tracks is given below.

Single Nucleotide Polymorphism (Complete Genomics sequencing)

CG 293, CG 293S, CG 293SG, CG 293SGGD, CG 293FTM, CG 293T

RTG 293, RTG 293S, RTG 293SG, RTG 293SGGD, RTG 293FTM, RTG 293T

These tracks show the results of the different SNP callers (CG and RTG) as vertical bars, and are best viewed zoomed in to a few kb or less, depending on the density of SNPs in that region. The tracks are also best displayed as ‘collapsed’ (right-click on the track in the name panel on the left, choose Display mode > Collapsed). SNPs or indels relative to the human reference genome are annotated by color as homozygous (red) or heterozygous (red and blue); no-calls (CG algorithm) are shaded. Note that in the expanded view, the colors differ in the lower half of each track: homozygous calls are cyan, heterozygous ones are dark blue and no-calls (CG algorithm) are white. Furthermore, in the RTG track, positions with low SNP quality scores (<5) are shown in grey. Hover over the bars for additional information on the SNP.

Gene expression profiles (Affymetrix exon array)

The processed expression array data (both exon-level and gene-level) can also be consulted via IGV. These tracks are best viewed within a range of a few Mb or smaller. Note that the data range is adjusted automatically to fit the window; when comparing different tracks, it is therefore advisable to set the data range to the same value for each (right-click on the row of interest in the leftmost column > Set Data Range…).
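One simple way to pick a common data range is to take the smallest and largest value over all tracks being compared. The sketch below shows this with made-up numbers; the function name and the input layout (track name mapped to a list of values) are assumptions for illustration.

```python
def shared_data_range(tracks):
    """Return one (low, high) pair that covers the values of every track."""
    all_values = [value for values in tracks.values() for value in values]
    if not all_values:
        raise ValueError("no values to derive a range from")
    return min(all_values), max(all_values)


# Made-up expression values for two tracks in the same window.
example = {
    "293": [2.0, 8.5, 6.1],
    "293T": [3.1, 11.2, 4.4],
}
low, high = shared_data_range(example)
print(f"Set Data Range for every track to min={low}, max={high}")
```

The resulting values can then be entered in Set Data Range… for each track.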

Differentially expressed genes between cell lines (locus based)

This track shows the pairwise comparisons of differentially expressed genes. It thus allows visualization of every gene that has been detected as significant (p<0.01) in the comparison of interest, starting from the filtered and noise-removed dataset. Additionally, the associated fold change is encoded in the bar height.
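As an illustration of the selection described above, the sketch below keeps only genes with p<0.01 and uses the log2 fold change as bar height. The record layout, the gene names and the numbers are invented, and the exact mapping from fold change to bar height in the actual track is an assumption.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    gene: str
    p_value: float
    log2_fold_change: float


def significant_bars(results, alpha=0.01):
    """Map each significant gene (p < alpha) to its bar height."""
    return {
        r.gene: r.log2_fold_change
        for r in results
        if r.p_value < alpha
    }


# Invented results for one pairwise comparison.
results = [
    GeneResult("GENE_A", 0.001, 2.3),
    GeneResult("GENE_B", 0.2, 0.4),
    GeneResult("GENE_C", 0.009, -1.7),
]
print(significant_bars(results))
```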

Mean probeset expression

Mean probeset expression (extensive filtering)

Mean probeset expression (extensive filtering and noise removed)

Two tracks refer to the exon-level data. Both provide the background-corrected, normalized and summarized signal intensities for the exon-level extended probesets, after filtering out probes undetected in all lines as well as cross-hybridizing probesets. Moreover, by providing the noise-removed dataset in an additional track, we offer the possibility to look at the data both before and after removal of probes that we regarded as noisy (average signal intensity value lower than 7 in all lines). It is thus up to the user to decide which dataset is more relevant for their work.
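The noise criterion can be written down in a few lines. The sketch below drops a probeset only when its average signal intensity is below 7 in every cell line, as described above. The data layout (probeset ID mapped to per-line averages), the function name and the example values are assumptions.

```python
def remove_noisy_probesets(intensities, threshold=7.0):
    """Drop probesets whose average intensity is below `threshold` in all lines.

    `intensities` maps a probeset ID to a dict of {cell line: average intensity}.
    """
    return {
        probeset: per_line
        for probeset, per_line in intensities.items()
        if not all(value < threshold for value in per_line.values())
    }


# Invented intensities for three probesets in two cell lines.
example = {
    "ps1": {"293": 5.2, "293T": 6.9},   # below 7 everywhere: removed
    "ps2": {"293": 5.2, "293T": 8.1},   # at or above 7 in one line: kept
    "ps3": {"293": 9.4, "293T": 10.0},  # kept
}
print(sorted(remove_noisy_probesets(example)))
```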

Web link to gene exp

This IGV track maps the differentially expressed genes based on the Affymetrix transcript cluster annotations. It provides a link (double-click on the bar) to a summary of the gene-level statistical data, including (per pairwise comparison) raw and adjusted p-value, t-statistic and log2 fold change. For the sake of clarity and completeness, this includes the loci that were categorized as too noisy for manual inspection.

Short-reads Alignment

Complete Genomics local realignment

Realign/293, Realign/293S, Realign/293SG, Realign/293SGGD, Realign/293FTM, Realign/293T

The realignment tracks depict the reads (grey horizontal bars, lower part of the track) that have been remapped during the realignment process, and their coverage of the realignment region (upper part of the track). Consequently, the white regions in this track are not necessarily regions without coverage, but more likely regions where no anomalies (SNPs or indels, for instance) were detected during the raw alignment to the reference human genome. Sequence variations in the individual reads are shown as well. It can be useful to combine this track with the SNP/indel tracks, e.g. to manually inspect the data underlying a particular SNP caller result. The data here are best viewed at high magnification (a few hundred bp or less).

Complete Genomics coverage plot

Coverage/293, Coverage/293S, Coverage/293SG, Coverage/293SGGD, Coverage/293FTM, Coverage/293T

This track plots the coverage as determined during the raw alignment. It can help to get an idea of how strongly the data support a particular SNP call.

HEK293A Illumina mate-pair sequencing

These tracks show the Illumina mate-pair sequencing data of the HEK293A cell line. The different tracks represent the data as processed with different read alignment software, namely BWA (Burrows-Wheeler Aligner) or RTG, the latter for both mated and unmated reads. The color of a read corresponds to the chromosome it aligns to, meaning that a read whose color deviates from the bulk of the neighbouring reads can also be aligned to another chromosome. As with the CG realignment data, it is best to zoom in to a few hundred bp or less.

Copy Number Variation (CNV)

Complete Genomics CNV by HMM algorithm/in 2kb window size

HMM/293, HMM/293S, HMM/293SG, HMM/293SGGD, HMM/293FTM, HMM/293T

2KB/293, 2KB/293S, 2KB/293SG, 2KB/293SGGD, 2KB/293FTM, 2KB/293T

These tracks represent copy number variation across the genome of the various cell lines and are best viewed in a window of a few Mb. The data in both tracks is based on the Complete Genomics CNV pipeline 1.11 (and thus on sequence coverage), but is represented in different ways. For the CNV 2KB track, the copy number was binned in 2 kb windows and is represented as a bar chart. For the CNV HMM track, the copy number was determined by means of a Hidden Markov Model and is represented as a color-coded horizontal bar: green indicates regions with a higher copy number than average for that genome, red a lower copy number. Note that while the copy number is ordinarily normalized assuming diploidy (2n), here the data was calibrated to the Illumina SNP array average copy number per chromosome as an independent reference for ploidy.
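To make the 2 kb binning concrete, the sketch below averages per-position values within fixed, non-overlapping 2 kb windows. It is a generic illustration, not the Complete Genomics pipeline itself; the function name, the 0-based coordinates and the example values are assumptions.

```python
from collections import defaultdict


def bin_mean(values_by_position, window=2000):
    """Average values in fixed windows; keys of the result are window starts."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for position, value in values_by_position.items():
        start = (position // window) * window  # start of the enclosing window
        sums[start] += value
        counts[start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


# Invented copy number estimates at a few 0-based positions.
example = {0: 2.0, 1999: 4.0, 2000: 3.0, 4500: 1.0}
print(bin_mean(example))
```

Each key in the result is the start of a 2 kb window, so the output corresponds to one bar per window in a bar chart.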

CNV based on Illumina SNP array

293, 293S, 293SG, 293SGGD, 293FTM, 293T

Copy number variation across the genomes as determined with the Illumina SNP arrays, by allele.

Structure Variation

293, 293S, 293SG, 293SGGD, 293FTM, 293T, NA19238

293 (subtracted with NA19238), 293S (subtracted with A), 293SG (subtracted with A), 293SGGD (subtracted with A), 293FTM (subtracted with A), 293T (subtracted with A), NA19238 (subtracted with A)

The structure variation tracks contain the data from the ‘junction sequence contigs’, thereby indicating breakpoints involved in chromosomal rearrangements. Hover over a breakpoint in this track for more detailed information on the nature of the rearrangement, as well as its exact position, length, the genes involved, and more. The user can load either the tracks with all structural variants, or the tracks with only the new variants found when compared with another genome (either the parental HEK293A genome, or the reference NA19238).
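The idea behind the ‘subtracted’ tracks can be sketched as a set difference: keep the junctions of a sample that have no counterpart in the genome it is compared with. The matching rule below (same chromosomes, breakpoints within a tolerance) and all names and coordinates are assumptions for illustration; the actual comparison used to build the tracks may differ.

```python
from typing import NamedTuple


class Junction(NamedTuple):
    left_chrom: str
    left_pos: int
    right_chrom: str
    right_pos: int


def same_junction(a, b, tolerance):
    """True if both breakpoints lie on the same chromosomes within `tolerance` bp."""
    return (
        a.left_chrom == b.left_chrom
        and a.right_chrom == b.right_chrom
        and abs(a.left_pos - b.left_pos) <= tolerance
        and abs(a.right_pos - b.right_pos) <= tolerance
    )


def subtract_junctions(sample, baseline, tolerance=0):
    """Return the junctions in `sample` that have no match in `baseline`."""
    return [
        junction
        for junction in sample
        if not any(same_junction(junction, other, tolerance) for other in baseline)
    ]


# Invented junctions: the first one is shared with the baseline genome.
baseline = [Junction("chr1", 1000, "chr5", 2000)]
sample = [Junction("chr1", 1010, "chr5", 1995), Junction("chr3", 500, "chr3", 9000)]
print(subtract_junctions(sample, baseline, tolerance=20))
```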

Public Data

Broad public RNAi

This track represents the positions targeted by the shRNAs from the Broad Institute’s TRC2 collection (distributed by Sigma). The availability of the HEK genome sequence should now allow users to predict which shRNA clones are more likely to work in these HEK293 cell lines.

Public CG data

69 cell lines

The ‘69 cell lines’ track is a mappability track. It compiles Complete Genomics sequencing data from 69 genomes, thereby allowing identification of systematic absence of coverage. A value of 0 here means that no read mapping could be obtained for any of the samples, while a value of 69 would mean that there was read support for all samples. Gaps in this track are therefore indicative of genome- or platform-related biases, and can help to avoid overinterpretation of sequencing results.
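The value plotted in such a track can be thought of as a per-position count of samples with read support. The sketch below computes that count from per-sample coverage vectors. The input layout, the sample names and the rule "coverage above zero counts as support" are assumptions for illustration.

```python
def samples_with_support(coverage_by_sample):
    """For each position, count the samples with at least one mapped read.

    `coverage_by_sample` maps a sample name to a list of per-position depths;
    all lists must cover the same positions.
    """
    lengths = {len(depths) for depths in coverage_by_sample.values()}
    if len(lengths) > 1:
        raise ValueError("all samples must cover the same positions")
    return [
        sum(1 for depth in column if depth > 0)
        for column in zip(*coverage_by_sample.values())
    ]


# Invented coverage for three samples over four positions.
example = {
    "sample1": [0, 3, 1, 12],
    "sample2": [0, 0, 2, 9],
    "sample3": [0, 5, 0, 7],
}
print(samples_with_support(example))
```

With 69 samples as input, a position without read support in any sample gets 0 and a position supported in all samples gets 69.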

Hg18 GC% 5 bases

Here the GC% per 5 bases is plotted out along the sequence. This track can be useful to pinpoint GC-rich areas, which might be more prone to mapping issues.
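For readers who want to reproduce this kind of track on a sequence of their own, the sketch below computes the GC percentage in consecutive windows of 5 bases. Whether the original track uses non-overlapping windows is an assumption, as are the function name and the example sequence.

```python
def gc_percent_windows(sequence, window=5):
    """Return the GC percentage of each consecutive, non-overlapping window."""
    sequence = sequence.upper()
    percentages = []
    for start in range(0, len(sequence), window):
        chunk = sequence[start:start + window]  # the last chunk may be shorter
        gc_count = sum(1 for base in chunk if base in "GC")
        percentages.append(100.0 * gc_count / len(chunk))
    return percentages


# Invented 10-base sequence: a GC-rich window followed by an AT-only window.
print(gc_percent_windows("GGCCAATTAA"))
```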

Other tracks

The other tracks represent public CG sequencing data from two Central-European trios in two different ways. The first one, avgNormalizedCvg, depicts the sequencing coverage normalized by averaging the coverage over 2 kb windows, whereas the second, gcCorrectedCvg, reflects a GC%-corrected coverage calculation (with 1 kb window). Just like the ‘69 cell lines’ track, they allow comparison of personal data with public data for the identification of biases or systematic errors.

Notes on the use of tracks in IGV

We do not advise loading all data tracks at once: it might take some time (depending on your machine) and the browser content cannot be examined efficiently in this way. Instead, load only those tracks relevant for your particular question. As mentioned above, each track can also be displayed in three view modes: expanded, squished or collapsed. Tracks can be removed by right-clicking on the name panel on the left and selecting the option “Remove Track”. Similarly, the data range can be adapted (often necessary when viewing the 2kb CNV tracks).