International HapMap Project
 

Home | About the Project | Data | Publications | Tutorial


Guidelines for the Responsible Use and Publication of HapMap data are available here.

Browse data graphically

Use the Generic Genome Browser to view HapMap Project data in the context of other genomic features, as well as retrieve genotypes & frequencies for specific genomic regions.

Generate reports and extracts of data using HapMart.

Jump directly to a chromosome in the dataset.

Downloads

The following directories contain HapMap project-related data, software, and documentation that have been made publicly available (see the HapMap Data Access Policy for more information). More details about each dataset can be found in the READMEs in the respective directories:

ENCODE regions

Ten ENCODE regions are being studied by HapMap centers. The work includes genotyping all dbSNP SNPs in each region, as well as resequencing in several samples and genotyping the additional SNPs found. These regions were chosen by the Analysis Group and include a range of chromosomes, recombination rates, gene densities, and values of non-transcribed conservation with mouse. For more information about the ENCODE Project, see the HapMap ENCODE Page; special genotype data dumps are available here.

Release notes

HapMap data release #28, August 2010, on NCBI B36 assembly, dbSNP b126 --

This release contains genotypes and frequencies for non-redundant SNP assays from phases I+II
and III of the HapMap project. For SNPs genotyped in both phases of the project, phase III SNP
assays were preferentially selected because of the larger number of HapMap samples included
in that phase.
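
The selection rule described above can be sketched as a keyed merge in which a phase III assay replaces a phase I+II assay for the same SNP. This is only an illustration, not the DCC's actual pipeline: the function name, the dictionary layout, and the SNP identifiers are hypothetical placeholders.

```python
def merge_assays(phase_1_2, phase_3):
    """Combine assays keyed by SNP id, preferring phase III when both exist.

    Both arguments are assumed to be dicts mapping a SNP id to an assay
    record; this layout is an assumption made for the sketch.
    """
    merged = dict(phase_1_2)   # start from the phase I+II assays
    merged.update(phase_3)     # phase III entries override on shared SNP ids
    return merged


# Hypothetical identifiers and records, for illustration only.
phase_1_2 = {"snp_a": "assay from I+II", "snp_b": "assay from I+II"}
phase_3 = {"snp_b": "assay from III", "snp_c": "assay from III"}

merged = merge_assays(phase_1_2, phase_3)
for snp_id in sorted(merged):
    print(snp_id, merged[snp_id])
```

Here snp_b appears in both inputs, so only its phase III record is kept, which gives one assay per SNP in the merged set.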

Note for genotype files: For SNPs with no data in phase III (i.e., only included in phase I+II),
"NN" was used as a placeholder.
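
When reading genotype files, the "NN" placeholder should be treated as missing data rather than as a genotype. The following is a minimal sketch, assuming each call is a two-character string such as "AG"; that layout and the helper names are assumptions, so check the README for the actual file format.

```python
from typing import Iterable, Optional, Tuple

MISSING_CALL = "NN"  # placeholder described in the release notes


def parse_genotype(call: str) -> Optional[Tuple[str, str]]:
    """Return the two alleles of a call, or None for the "NN" placeholder.

    Assumes two-character calls (an assumption of this sketch).
    """
    call = call.strip().upper()
    if call == MISSING_CALL:
        return None
    if len(call) != 2:
        raise ValueError(f"unexpected genotype call: {call!r}")
    return call[0], call[1]


def call_rate(calls: Iterable[str]) -> float:
    """Fraction of calls that carry data (are not the placeholder)."""
    parsed = [parse_genotype(c) for c in calls]
    if not parsed:
        return 0.0
    return sum(p is not None for p in parsed) / len(parsed)


print(call_rate(["AG", "NN", "GG", "AA"]))  # 0.75
```

Excluding placeholder calls before computing frequencies avoids counting "N" as an allele.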

Additional information may be found in the README document:
ftp://ftp.ncbi.nlm.nih.gov/hapmap/genotypes/2010-08_phaseII+III/00README.txt 

The following inconsistencies were detected after processing of the data in this release: 

* Known indels: 422 SNP assays with alleles herein coded as D and I
* Multiple positions: 23 SNPs with multiple assays mapping to different genomic locations
* Multiple alleles: 27 SNP assays with more than two alleles, one of which could be absent (deletion)

Complete lists of the above inconsistencies may be found in:
ftp://ftp.ncbi.nlm.nih.gov/hapmap/genotypes/2010-08_phaseII+III/inconsistencies/

Genotyped non-redundant QC+ SNPs:

chr	ASW	CEU	CHB	CHD	GIH	JPT	LWK	MEX	MKK	TSI	YRI
1	125129	314222	317642	104948	112989	317446	123617	117736	124198	113882	313459
2	126201	329906	330754	106442	114414	330598	124885	118709	125468	115314	323846
3	104551	259502	259578	88819	95111	259476	103552	98249	104014	95977	255437
4	94564	248255	248468	79886	86093	248378	93711	87927	94344	86988	243603
5	95480	250784	251326	82127	87776	251246	94425	90156	94999	88213	246571
6	99107	272384	276251	85961	91492	276026	97485	94123	98185	93015	270475
7	82125	216652	217377	70759	75929	217274	81302	78240	81835	76571	213177
8	82181	215935	218940	69472	75346	218875	81222	77085	81758	75834	215371
9	69207	183950	185335	59952	64358	185255	67875	65594	68453	64611	182775
10	79701	211778	214061	68325	73467	214000	78538	76347	78974	73846	210425
11	76716	207013	208147	65193	69820	208129	75704	71899	76057	70446	202587
12	74164	195231	196061	63505	68269	195994	73029	70686	73362	68198	193879
13	56991	157480	159596	48263	52257	159554	56282	53219	56899	52690	156961
14	49078	124751	125414	41997	45027	125351	48557	46462	48789	45441	122710
15	45565	108502	109028	39124	41443	108955	45183	42409	45177	41668	106583
16	48100	111919	111861	40028	43165	111786	47670	44851	47619	43509	109811
17	40768	92433	92266	34004	36827	92177	40529	38280	40422	37391	90944
18	44751	120227	121042	37723	40388	121018	44307	41993	44481	40850	119034
19	27866	59707	59571	23876	25389	59520	27869	26888	27525	25665	58839
20	38787	121109	121114	32699	35313	121075	38301	36915	38480	35652	119137
21	21123	51010	52426	18302	19512	52405	20912	20036	21093	19732	51197
22	21890	56011	57615	18389	20029	57603	21409	20531	21470	20193	56842
X	39065	121177	121775	31971	34489	121804	40412	34719	38394	34280	121026
Y	616	943	930	576	606	926	620	605	585	556	922
M	5	212	206	2	1	206	7	0	6	4	211
-----------------------------------------------------------------------------------------------

Total	1543731	4031093	4056784	1312343	1409510	4055077	1527403	1453659	1532587	1420526	3985822

-----------------------------------------------------------------------------------------------
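
The per-population totals can be recomputed from the tab-separated rows of the table. The sketch below uses a short excerpt (chromosomes 21 and 22, three populations) copied from the table above; the function name is hypothetical, and the same function should work on the full table once the separator lines and the Total row are removed.

```python
import csv
import io

# Excerpt of the table above: chromosomes 21 and 22, three populations.
EXCERPT = """chr\tASW\tCEU\tYRI
21\t21123\t51010\t51197
22\t21890\t56011\t56842
"""


def column_totals(table_text):
    """Sum the SNP counts in each population column of a tab-separated table."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    totals = {}
    for row in reader:
        for population, count in row.items():
            if population == "chr":
                continue  # the first column labels the chromosome
            totals[population] = totals.get(population, 0) + int(count)
    return totals


totals = column_totals(EXCERPT)
print(totals)  # {'ASW': 43013, 'CEU': 107021, 'YRI': 108039}
```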

 hapmap-help@ncbi.nlm.nih.gov

About

The HapMap Data Coordination Center (DCC) coordinates and manages project data flow, data storage, data release and presentation to the community. This includes managing the genotype database and this website. Currently, the DCC is operated by Steve Sherry's group at the National Library of Medicine and National Center for Biotechnology Information of the National Institutes of Health.

Last updated : index.html.en,v 1.19 2007/04/12 20:18:09 tellorui Exp


Please send questions and comments about the website to hapmap-help@ncbi.nlm.nih.gov