For a text report of the genotypes in the region you are browsing:
- Select "Download SNP genotype data" under "Reports & Analysis".

- Press "Configure..."
- Select a population from the drop-down menu and the format in which to display your results.
The underlying file format is currently space-delimited text tables. For details on the format, see the genotype download directory README.
Note: If you install Haploview, the output file can be loaded into this Java application for further data analysis.
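Because the report is a space-delimited text table, it can be read with a few lines of code. The following is a minimal sketch, assuming that the first non-blank line is a header row; the column names and values in the sample are illustrative placeholders, not the actual layout, which is described in the README.

```python
import io

def read_space_delimited(handle):
    """Yield one dict per data row, keyed by the header row's column names."""
    header = None
    for line in handle:
        fields = line.split()      # split on any run of whitespace
        if not fields:
            continue               # skip blank lines
        if header is None:
            header = fields        # assumption: first non-blank line is the header
            continue
        yield dict(zip(header, fields))

# Illustrative sample only; real column names are documented in the README.
sample = io.StringIO(
    "snp_id alleles chrom pos sample1 sample2\n"
    "snpA A/G chr1 1000 AG GG\n"
    "snpB C/T chr1 2000 CC CT\n"
)
rows = list(read_space_delimited(sample))
print(rows[0]["sample1"])
```

To read a downloaded report instead of the in-memory sample, pass an open file handle to `read_space_delimited`.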
For a text report of HapMap allele or genotype frequency data, proceed as above, but select "Download SNP Allele Frequency Data" or "Download SNP Genotype Frequency Data" instead.

The underlying file format is currently space-delimited text tables. For details on the format, see the frequency download directory README.
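To make the relationship between genotype data and the two frequency reports concrete, the sketch below derives allele and genotype frequencies from a list of genotype calls. It assumes two-letter calls such as "AG" and treats any call containing "N" as missing; both conventions are assumptions for illustration and should be checked against the README.

```python
from collections import Counter

def frequencies(genotypes):
    """Return (allele_freqs, genotype_freqs) from two-letter calls such as 'AG'."""
    # Assumption: calls containing 'N' are missing data and are ignored.
    called = [g for g in genotypes if len(g) == 2 and "N" not in g]
    n = len(called)
    if n == 0:
        return {}, {}
    # Sort the two alleles so that 'GA' and 'AG' count as the same genotype.
    geno_counts = Counter("".join(sorted(g)) for g in called)
    allele_counts = Counter(a for g in called for a in g)
    geno_freqs = {g: c / n for g, c in geno_counts.items()}
    allele_freqs = {a: c / (2 * n) for a, c in allele_counts.items()}
    return allele_freqs, geno_freqs

allele_freqs, geno_freqs = frequencies(["AG", "GG", "GA", "AA", "NN"])
print(allele_freqs, geno_freqs)
```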
Linkage disequilibrium (LD) refers to the phenomenon that alleles that are close together in the genome tend to be inherited together.
The HapMap GBrowser allows you to display three LD measures for a given set of markers as separate tracks: D', r2, and LOD score. For example, you may want to compare the LD patterns of different populations or limit the SNPs displayed to those meeting certain criteria. To do this, you can select different colors to represent different LD measures (e.g., red for D' and blue for r2), set thresholds for LD measures (e.g., D' = 1 or r2 > 0.8), and invert the orientation of the plot for different populations (e.g., normal for YRI and inverted for CEU).
- To view LD patterns, select the plugin: LD Plot track.

- To configure the LD tracks, select the "Annotate LD Plot" option from the Reports & Analysis drop-down list, and click the "Configure" button.
- In the configuration page (see image below), set up the parameters for each LD property, such as color of the pairwise plot, threshold for SNPs to be displayed, populations to be displayed, etc. Below is a description of each track and configuration options.

- For each HapMap data release, the LD data is calculated for markers up to 250 kb apart. Precalculated LD values can be downloaded by selecting the "Download HapMap LD Data" option from the Reports & Analysis drop-down list. Click the "Configure" button and then "Go".
LD Plots
- D' plot:
This is the default plot and is displayed when the user turns on the LD plot track. This track plots the raw D' score for a given marker pair. D' is a measure of linkage disequilibrium between two genetic markers. A value of D' = 1 (complete LD) indicates that two SNPs have not been separated by recombination, while values of D' < 1 (incomplete LD) indicate that the ancestral LD was disrupted during the history of the population. Only D' values near one are a reliable measure of LD extent; lower D' values are usually difficult to interpret as the magnitude of D' strongly depends on sample size. To calculate this value, see the article by Lewontin (1988).
- r2 plot:
This track plots the raw r2 score for a given marker pair. The r2 is a measure of linkage disequilibrium between two genetic markers. For SNPs that have not been separated by recombination and have the same allele frequencies (perfect LD), r2 = 1. In such a case, the SNPs are said to be redundant. Lower r2 values indicate a lower degree of LD. One useful property of r2 for association studies is that its inverse value, 1/r2, provides a practical estimate of the factor by which the sample size must be increased in a study design to detect association between the disease and a marker locus, compared with the size required for detecting association with the susceptibility locus itself. To calculate this value, see the article by Pritchard and Przeworski (2001).
- LOD score plot:
This track plots the logarithm of the odds (LOD score) for linkage disequilibrium between a given marker pair.
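The D' and r2 measures described above can be illustrated with a short calculation. This is a minimal sketch using the standard textbook definitions for two biallelic markers, computed from known haplotype and allele frequencies; it is not the code used to produce the HapMap LD data, and the function and parameter names are hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D', r2) for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A at marker 1
          and allele B at marker 2
    p_a, p_b: frequencies of allele A and allele B
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both markers must be polymorphic")
    d = p_ab - p_a * p_b                      # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max                  # D scaled by its maximum possible value
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2

# Allele B occurs only on haplotypes that carry allele A, but the two
# alleles have different frequencies (0.5 versus 0.2).
d_prime, r2 = ld_measures(p_ab=0.2, p_a=0.5, p_b=0.2)
sample_size_factor = 1 / r2
print(d_prime, r2, sample_size_factor)
```

In this example D' is 1 (complete LD) while r2 is about 0.25, because the allele frequencies differ; the corresponding 1/r2 factor is about 4.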
Configuring LD Plots
The LD plot plugin generates a pairwise plot of marker-to-marker LD values, where the genotyped SNPs are denoted as ticks and the pairwise marker information is plotted as boxes between these ticks.
These are some parameters you may change in the configuration of an LD plot:
- Color: The intensity of the box color is proportional to the strength of the LD property for the marker pair. The default color scheme is similar to the Haploview standard color scheme (see tables below).
| Standard LD Color Scheme |
| | D' < 1 | D' = 1 |
| LOD < 2 | white | blue |
| LOD >= 2 | shades of pink/red | bright red |

| r2 Color Scheme |
| r2 = 0 | white |
| 0 < r2 < 1 | shades of grey |
| r2 = 1 | black |

| Alternate D'/LOD Color Scheme |
| | Low D' | High D' |
| Low LOD | white | shades of pink |
| High LOD | white | black |
- Segment Size: The maximum segment size to view the LD plot can be adjusted. The default is set to 250 kb.
- Number of SNPs: The number of SNPs displayed in a segment can be adjusted. Setting a lower number is useful in segments that are densely genotyped, such as an ENCODE region, because restricting the number of SNPs displayed lets the images load faster. For a 250 kb segment, the default is 200 SNPs.
- Box Size: Two options are available, Proportionate and Uniform. With Proportionate, the box size depends on the physical distance between adjacent markers: SNPs close to each other have smaller boxes, and SNP pairs farther apart have larger boxes. A given region therefore has boxes of varying sizes that reflect the actual distance between the markers. The plot has a "filled in" look, and regions with a low density of markers have larger boxes. With Uniform, all boxes are the same size, but their placement introduces gaps where the markers are farther apart. The size of the box depends on the marker density for a given segment and the distribution of markers therein. The plot has a "dispersed" look. Choose whichever option you prefer.
- LD Properties: dprime refers to a D' plot, rsquare refers to an r2 plot, and lod refers to a LOD plot (see above for an explanation of each plot). First select an LD plot, then set its corresponding properties (i.e., LD thresholds, color of the pairwise plot, and orientation for a given population).
- LD thresholds (greater than/less than): Set the thresholds for the selected LD measure.
- Color of the Pairwise Plot: Set the color for the D' and r2 plots. Please note that our current restriction to LD values <= 1 will make the LOD plot appear in the white/blue scale (see LOD <2 in the above color table).
- Populations: Turn on one or more populations to plot. The standard abbreviations for the populations are used. For more information about the populations, please refer to Guidelines for Referring to HapMap Populations, Publications and Presentations.
- Orientation: The pairwise plot can be inverted which might help in comparing the LD plots between two populations. The orientation can be changed for all the plots.
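The standard and r2 color schemes tabulated above can be read as simple lookups. The sketch below encodes those two tables directly; the function names are hypothetical, and the returned strings are the table labels rather than actual color values used by the browser.

```python
def standard_ld_color(d_prime, lod):
    """Color label from the Standard LD Color Scheme table."""
    if lod < 2:
        return "blue" if d_prime >= 1 else "white"
    return "bright red" if d_prime >= 1 else "shades of pink/red"

def r2_color(r2):
    """Color label from the r2 Color Scheme table."""
    if r2 <= 0:
        return "white"
    if r2 >= 1:
        return "black"
    return "shades of grey"

print(standard_ld_color(d_prime=1.0, lod=3.5))
print(r2_color(0.4))
```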
Note: LD data is generated from the Haploview software's .DPRIME and .CHECK output files. For each HapMap data release the LD data is calculated for markers up to 500 kb. A non-redundant set of genotyped SNPs is used to calculate the LD data. For each chromosome, the non-redundant set is created by picking the first genotyped SNP that has a MAF > 0. The Genotyped SNP track shows the complete set of markers genotyped (including duplicate SNPs genotyped on different platforms by different centers). For this reason, you might not find a one-to-one match between the markers in the two tracks.
Phased haplotypes were generated using the program PHASE version 2.0 (see the article by Stephens and Donnelly 2003). During phasing, each allele in a genotype is assigned to one or the other parental chromosome, using a maximum likelihood algorithm that uses trio (lineage) information in the HapMap population groups, or, if trio information is not available, by fitting the data to a model that minimizes the number of implied historical crossovers in the population. The phased haplotypes are displayed as a graphic in which each chromosome of the individuals sampled by the project is represented as a line one pixel high and each SNP allele is arbitrarily colored blue or yellow. A region of high LD will appear as a region in which there are long runs of SNPs whose alleles are the same color, indicating that there is little recombination among them. A region of low LD will appear as an area where the runs are shorter and more fragmentary.
- To view phased haplotypes, select the plugin: Phased Haplotype Display track.
- Select the "Annotate Phased Haplotype Display" option from the Reports & Analysis drop-down list, and click the "Configure" button.
- Select the population(s) for which to display haplotype information, and click the "Configure" button to return to the main display.

The phased haplotypes for each population selected will appear in a separate track following the two-color scheme described earlier (see image below, track CEU Phased Haplotypes). The order of chromosomes is determined by a fast hierarchical clustering methodology, which places chromosomes that share similar haplotypes together.
- To retrieve the detailed phased genotypes, click on the track of the desired population. This will take you to a page that provides the haplotype information in tabular form, where each row represents each chromosome, and each column is an individual SNP. The background of each table entry is set to a color corresponding to what is seen in the graphical track.
The advantage of this display over the pairwise LD triangle display described earlier is that it is more compact and therefore better suited for the display of large regions. This makes it easy to correlate the position of long common haplotypes with SNPs chosen by the tag-SNP picker. The disadvantage of this display is that it conceals much of the fine-structure of LD in the region, in particular strong linkage among SNPs that are not adjacent to one another.
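The idea of placing similar chromosomes next to each other and coloring alleles in two colors can be sketched as follows. This is an illustration only: it uses a simple greedy nearest-neighbor ordering by Hamming distance, which is a stand-in for, not a reproduction of, the hierarchical clustering used by the browser. The haplotype strings are made up, and the "B"/"Y" coding relative to the first row is an arbitrary convention chosen for this example.

```python
def hamming(a, b):
    """Number of SNP positions at which two haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))

def order_by_similarity(haplotypes):
    """Greedy ordering: repeatedly append the remaining haplotype closest to the last one."""
    if not haplotypes:
        return []
    remaining = list(haplotypes)
    ordered = [remaining.pop(0)]
    while remaining:
        nearest = min(remaining, key=lambda h: hamming(ordered[-1], h))
        remaining.remove(nearest)
        ordered.append(nearest)
    return ordered

def two_color_rows(haplotypes):
    """Code each allele 'B' (blue) if it matches the first row at that SNP, else 'Y' (yellow)."""
    reference = haplotypes[0]
    return ["".join("B" if a == r else "Y" for a, r in zip(h, reference))
            for h in haplotypes]

haps = ["ACGT", "GTGA", "ACGA", "GTGT"]   # made-up haplotypes, one string per chromosome
ordered = order_by_similarity(haps)
for row in two_color_rows(ordered):
    print(row)
```

After ordering, chromosomes that share alleles sit on adjacent rows, so long runs of the same letter correspond to the long same-color runs that indicate high LD in the graphical track.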

Tag-SNPs are a reduced set of SNPs that capture most of the genetic variation in a region, and can be used in association studies to reduce the number of SNPs needed to detect LD-based association between a trait of interest and a region of the genome.
There is no single set of tag-SNPs that will satisfy the diverse requirements of every association study design. For example, you may wish to select SNPs that work well with a particular genotyping system (such as those that have been included on a particular SNP chip) and may be willing to accept different tradeoffs between the cost of genotyping a study population and the strength of the association they can detect. For this reason, the HapMap website does not offer a static set of pre-selected tag-SNPs, but instead offers researchers a tool for interactively selecting tag-SNPs based on user-provided criteria. This tag-SNP picking tool is Tagger, an algorithm that chooses tag SNPs by formally maximizing the number of linked SNPs captured by a given set of tags.
- To choose tag-SNPs, select the plugin: tag SNP picker track.
- Select "Annotate tag SNP picker" under "Reports & Analysis", and press the "Configure" button.
Options for tag-SNP selection include selecting a population and a tagging method (pairwise or multi-marker), uploading a list of SNP IDs to be included in the set of tag-SNPs, uploading a list of SNP IDs to be excluded from the set of tag-SNPs, uploading a list of design scores (priorities) for each SNP, and selecting cutoffs for the minimum acceptable LD value and allele frequency for SNPs to be included in the set. More information on these options is available on the Tagger website.

- After setting the desired options, click the 'Configure' button to run the analysis and return to the main display. Results are shown on a new feature track labeled tSNPs_TaggerMethod_population.

- To generate a text list of tag-SNPs, select "Download tag SNP data" under "Reports & Analysis", and press the "Configure" button and "Go".
The generated report contains a tab-delimited list of tag-SNP names, with chromosome, position, and allele frequency in the region. This is followed by a section that lists the LD tests performed between all SNP pairs to select the tag-SNPs, based on the r2 cutoff set in the configuration page. The last section lists the non-tag SNPs that each tag-SNP captures.
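To illustrate how an r2 cutoff turns pairwise LD values into a set of tag-SNPs and the SNPs each one captures, here is a minimal sketch of greedy pairwise tagging. It is a simplified illustration and not Tagger's actual implementation: it ignores include/exclude lists, design scores, allele frequency cutoffs, and multi-marker tests. The SNP identifiers and r2 values are made up.

```python
def greedy_pairwise_tags(snps, r2, threshold=0.8):
    """Return {tag_snp: [captured non-tag SNPs]} using a greedy pairwise rule.

    snps: list of SNP identifiers
    r2:   dict mapping frozenset({snp_a, snp_b}) to the pairwise r2 value;
          missing pairs are treated as r2 = 0 (an assumption of this sketch)
    """
    def captures(tag, other):
        return tag == other or r2.get(frozenset((tag, other)), 0.0) >= threshold

    untagged = set(snps)
    tags = {}
    while untagged:
        candidates = [s for s in snps if s in untagged]
        # Pick the candidate capturing the most untagged SNPs (ties: input order).
        best = max(candidates, key=lambda s: sum(captures(s, o) for o in untagged))
        captured = {o for o in untagged if captures(best, o)}
        tags[best] = sorted(captured - {best})
        untagged -= captured
    return tags

snps = ["snp1", "snp2", "snp3", "snp4"]          # hypothetical identifiers
r2 = {
    frozenset(("snp1", "snp2")): 0.90,
    frozenset(("snp2", "snp3")): 0.85,
    frozenset(("snp3", "snp4")): 0.30,
}
tags = greedy_pairwise_tags(snps, r2, threshold=0.8)
print(tags)
```

With these values, snp2 is chosen as a tag that captures snp1 and snp3, and snp4 must tag itself because no other SNP reaches the cutoff. Lowering the threshold yields fewer tags at the price of weaker correlation between each tag and the SNPs it stands in for.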
The following table describes the symbols used at different magnifications and their meaning in the Generic Genome Browser. Note all alleles are with respect to the plus strand.

| Last updated : gbrowse_help.html,v 1.8 2005/10/27 19:38:19 krishnan Exp |