Hum Genet (2005) 117: 545–557 DOI 10.1007/s00439-005-1328-6
ORIGINAL INVESTIGATION
Shane McCarthy · Salim Mottagui-Tabar · Yumi Mizuno · Bengt Sennblad · Johan Hoffstedt · Peter Arner · Claes Wahlestedt · Björn Andersson
Complex HTR2C linkage disequilibrium and promoter associations with body mass index and serum leptin
Received: 10 February 2005 / Accepted: 14 April 2005 / Published online: 14 July 2005 Springer-Verlag 2005
Abstract The occurrence of obesity, eating disorders, and related diseases has increased in many parts of the world. Given that few strong genetic factors have been found, it is clear that these are complex multi-factorial diseases. The serotonin receptor 2C, a member of the 5HTergic system, has been implicated in the control of phagia and obesity. We report a detailed investigation of linkage disequilibrium (LD) within and between the HTR2C promoter and the flanking sequences around a commonly utilized marker in the second coding exon of HTR2C. We suggest that inconsistent associations between HTR2C and several phenotypes, including obesity, may be due to the LD pattern across the gene in which recombination and gene conversion have been influential. The nucleotide and haplotype distribution is consistent with that of the neutral mutation model. The number of haplotypes suggests demographic influences or overdominant selection that may have a function in HTR2C expression. Using the fine LD pattern, we describe a possible association with promoter haplotypes and diplotypes, including a GT microsatellite, and body mass index (BMI) ≥30 kg/m2 (P<0.0001). SNP
Electronic Supplementary Material Supplementary material is available for this article at http://dx.doi.org/10.1007/s00439-005-1328-6
S. McCarthy (✉) · S. Mottagui-Tabar · Y. Mizuno · B. Sennblad · C. Wahlestedt · B. Andersson
Center for Genomics and Bioinformatics, Karolinska Institutet, Berzelius Väg 35, 17177 Stockholm, Sweden
E-mail: [email protected]
Tel.: +46-8-52483989
Fax: +46-8-311620
J. Hoffstedt · P. Arner
Department of Medicine, Karolinska Institutet, Karolinska University Hospital at Huddinge, 17177 Stockholm, Sweden
B. Sennblad
Stockholm Bioinformatics Center, AlbaNova Stockholm University, 10691 Stockholm, Sweden
−995G>A heterozygotes, as well as promoter diplotypes, were found to be marginally associated with higher serum leptin corrected for percentage body fat (P=0.01), which might suggest that these subjects are leptin resistant. Our results complement previous studies of HTR2C in both mice and humans, and suggest the importance of genetic variation and of elucidating the fine LD structure in uncovering the genetic factors of obesity.
Introduction
A number of studies have indicated that components of the serotonergic system (5-HTergic) are involved in complex neurological, psychiatric, and behavioral disorders such as depression, obsessive-compulsive disorder, aggression, alcoholism, and schizophrenia. Studies using knock-out mice have established a role for the serotonin (5-HT) receptor HTR2C in modulating appetite behavior as well as neuronal excitability (Tecott et al. 1995). The serotonin re-uptake inhibitor d-fenfluramine failed to provoke suppressed ingestion in the mouse HTR2C knock-out model (Vickers et al. 1999), but decreased weight in normal rat models (Vickers et al. 2001). Administration of agonists, such as m-chlorophenylpiperazine (mCPP), resulted in a small but significant decrease in weight and food intake (Sargent et al. 1997), making HTR2C a candidate for disease association and treatment of obesity. In the search for functional variants, a G to C transversion replacing cysteine with serine within the first hydrophobic region of HTR2C was previously identified in a Finnish population (Lappalainen et al. 1995). This variant, rs6318, has been implicated in eating disorders [e.g., anorexia nervosa (Westberg et al. 2002), hypophagia (Quested et al. 1999)] and a reduction in weight in conjunction with HTR2C-specific agonists in the treatment of obesity (Sargent et al. 1997). However, other studies report a lack of association with bulimia,
binge eating disorder, underweight children and adolescents (Burnet et al. 1999; Lentes et al. 1997), and obesity, failing to confirm the former findings. Conflicting associations have also been reported with alcoholism (Lappalainen et al. 1999; Parsian and Cloninger 2001), bipolar affective disorder (Gutierrez et al. 1996; Vincent et al. 1999), and migraines (Burnet et al. 1997). Thus the difficulty in replicating associations has called into question the strength of this variant and the influence of HTR2C on these complex disorders. Recently a number of polymorphisms in the promoter of HTR2C have been described. Associations of these with leanness and resistance to type II diabetes in Japanese males (Yuan et al. 2000) and with reduced weight gain following treatment with clozapine, a HTR2C antagonist, in schizophrenia patients (Reynolds et al. 2003) have been published. However, a di-nucleotide repeat in the same promoter could not be associated with bipolar and panic disorder (Deckert et al. 2000). From these limited studies and their conflicting results, it is clear that more extensive analyses are needed in order to fully evaluate the involvement of HTR2C in complex phenotypes. Inconsistent results in association studies using sequence variants in HTR2C can be due to a number of factors. Poor study design, unrepresentative controls and phenotype heterogeneity may contribute, as well as hidden genetic factors. Single-site association studies using rs6318 depend either on direct involvement of the SNP in the phenotype or on linkage disequilibrium (LD) with the causative variant. Thus, the lack of prior knowledge of the regional LD pattern can also result in contradictory disease associations (Clark et al. 1998; Weiss and Clark 2002).
These problems can be reduced by sequencing biologically relevant candidate genes in multiple individuals to identify new variants, determine population-specific diversity and infer in-depth LD patterns across and between selected genes. Nucleotide and haplotype diversities and the complexity of LD patterns in a number of genes in different populations have been reported previously and demonstrate the importance of these analyses in disease gene mapping and association (Drysdale et al. 2000; Rieder et al. 1999). We have sequenced key regions within the X-linked HTR2C gene, including the promoter and sequences flanking rs6318, with the aim of elucidating the LD pattern and increasing the possibility of successful association studies. Levels of LD between the promoter and the intragenic regions were weak and there was evidence of recombination and gene conversion. We argue that these findings to a large extent explain the contradictory results in HTR2C studies. Neither locus deviated significantly from the infinite sites model, while promoter analysis may indicate a role for overdominant selection. We found associations at the haplotype level with obesity in a Swedish population. Heterozygous diplotypes could also be influential in higher serum leptin concentrations corrected for percentage body fat,
with effects from promoter SNP -995G>A and the GT microsatellite identified in this study. These results support the action of heterosis and suggest possible mechanisms for HTR2C in appetite control and protecting against hyperphagia.
Materials and methods
Sample ascertainment
For HTR2C sequence variation analysis, blood samples (n=64) were collected randomly from a larger sample of Swedish nonsmoking males between the ages of 20 and 70 years. DNA was prepared from each frozen sample via salt-ethanol precipitation (Miller et al. 1988; Sambrook and Russell 2001). The 774 samples for association analysis of HTR2C polymorphisms consisted entirely of healthy females with a large inter-individual variation in body mass index (BMI). They were all at least second-generation Swedes and recruited to study the genetics of body weight regulation. BMI status was classified according to WHO criteria. The distribution of BMI was uniform, but lacking values around 30 kg/m2; thus, the samples were categorized into non-obese (<30 kg/m2) or obese (≥30 kg/m2) groups. Mean ± standard deviation ages for the non-obese and obese groups were 41±9 and 41±11 years, respectively, and mean BMIs were 24±3 and 39±6 kg/m2, respectively. Body fat (percentage) was determined by bioimpedance (model TBF 305, Tanita, Japan). An antecubital venous blood sample was obtained after a 12-h overnight fast and immediately stored at −70 °C for subsequent DNA preparation and serum leptin measurement using a human leptin radioimmunoassay (RIA) kit (Linco, St. Charles, Mo., USA). It is well established that circulating leptin levels are proportional to body fat; in the present study group this relation was as follows: leptin (ng/ml) = (0.8 × % body fat) − 4, as determined by linear regression analysis (r=0.87). In order to make a valid comparison of leptin levels across the entire range of body fat, serum leptin values were corrected for body fat by expressing them as a leptin/% body fat ratio. The study was approved by the local committee on ethics and explained in detail to each participant, and informed consent was obtained.
HTR2C locus selection, primer design and sequencing
Loci flanking HTR2C SNPs used in previous association studies, and loci of possible relevance to HTR2C expression and function, were selected for sequencing. Overlapping amplification and nested sequencing primers were designed using the CPrimer program (http://iubio.bio.indiana.edu/soft/molbio/mac/) from GenBank entries GI:3845420 and GI:4753292 containing coding exons 1 and 2, intronic sequence 5′ to exon 1 (5.225 kb, nucleotides 64901–70124) and the HTR2C
promoter (1.285 kb, nucleotides 30064–31346), respectively. Primers were synthesized by Invitrogen and are listed in Electronic Supplementary Material (ESM) Table 1. All PCR reactions were carried out with AmpliTAQ Gold (Applied Biosystems, Foster City, Calif., USA) using a MJ Research PTC-225 (Wellesley, Mass., USA) under the conditions 95 °C for 10 min, 30 cycles of 95 °C for 1 min, 55–58 °C for 15 s, 72 °C for 1 min and final extension at 72 °C for 7 min. Amplicons were visually assessed on a 1.5% TBE-agarose gel. Direct product sequencing was performed using the DYEnamic ET Dye Terminator Cycle Sequencing Kit (Amersham Biosciences) for the Megabace 1000 in half-volume reactions according to the manufacturer's instructions. Reaction conditions were 93 °C for 30 s, 50–58 °C for 15 s and 60 °C for 1 min for 50 cycles. Reads were base-called with Phred, assembled using Phrap (Ewing and Green 1998; Ewing et al. 1998) and viewed using Consed version 13 (Gordon et al. 1998). All SNPs were documented in, and compared with, the SNP database dbSNP (http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?locusId=3358). The power to detect minor allele frequencies (q) between 1% and 5% was determined as 1 − (1 − q)^n, where n is the number of chromosomes (Glatt et al. 2001). The SNPs were validated using criteria adapted from Nickerson et al. (1998).
Haplotype frequency and linkage disequilibrium estimation
Haplotypes for each region and for both regions combined were established using Arlequin, which utilizes the expectation-maximization (EM) algorithm (Slatkin and Excoffier 1996). The promoter, intragenic and joint haplotype frequencies were determined using the 64 sequenced males. One microsatellite with three allelic lengths was observed. To permit their inclusion in the Arlequin package, each of these was assigned a letter of the genetic code. Categorized promoter haplotypes were confirmed by sequencing the entire promoter following amplification and cloning into a pGL3 vector (see Expression analysis below).
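The detection-power expression quoted above for SNP discovery, 1 − (1 − q)^n, is straightforward to evaluate. The sketch below is only an illustration: the function name is hypothetical, and it assumes n = 64 chromosomes (one X chromosome per sequenced male).

```python
def detection_power(q: float, n_chromosomes: int) -> float:
    """Probability of observing a minor allele of frequency q at least once
    among n_chromosomes sampled chromosomes: 1 - (1 - q)^n."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be a frequency between 0 and 1")
    return 1.0 - (1.0 - q) ** n_chromosomes


# Assumption: 64 males contribute one X chromosome each.
N_CHROMOSOMES = 64

for q in (0.01, 0.03, 0.05):
    print(f"q = {q:.2f}: power = {detection_power(q, N_CHROMOSOMES):.3f}")
```

Under these assumptions the value for q = 0.03 falls just above 0.85.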
Pairwise non-random association of the SNPs in each region was estimated using |D′| (Lewontin 1964) and r2 (Hill and Robertson 1968). Significance of the allelic associations was assessed by Fisher's exact tests.
Nucleotide and haplotype variation analysis
The program DnaSP 4.0 (Rozas and Rozas 1999) was used to analyze the extent of nucleotide and haplotype variation. Nucleotide variation was assessed in two ways: firstly, as θw, the number of nucleotides expected to be polymorphic, and secondly, as π, the average proportion of nucleotide differences between all possible pairs of sequences; i.e., the average heterozygosity per site (Hartl and Clark 1997). The correlation between both estimates was assessed using D (Tajima 1989). Under the infinite sites model, the neutrality of each locus was also examined using D* and F* (Fu and Li 1993). Haplotype diversity was calculated as described by Nei (1987), and its compliance with neutral theory under the infinite sites model was assessed using Fs (Fu 1997) and S (Strobeck 1987), which are measures for comparing the number of observed and expected haplotypes.
Recombination analysis
The conservative minimum number of recombination events, Rm, was calculated based on the four-gamete test (FGT), where it is assumed that overlapping pairwise haplotypes which share crossovers originate from the same rearrangement (Hudson and Kaplan 1985). Secondly, we estimated the population recombination parameter C=3Nec, where c is the per-generation recombination rate and Ne is the effective population size (Hudson 1987). This approach is based on the variance of all pairwise differences and, as r2 is a function of C, the population recombination parameter is a useful tool to interpret LD and diversity patterns (Andolfatto and Nordborg 1998). Lastly, the ZZ test statistic (Rozas et al. 2001), which removes all adjacent neighbor allelic correlations from total overall LD, was used to assess the effect of recombination on the regions studied. This test works by taking into account the allelic association between nonadjacent polymorphisms and the physical distance between them. The P values for the ZZ test statistics were obtained via coalescent simulations as described elsewhere (Rozas et al. 2001, 2003).
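The four-gamete test can be sketched in a few lines. This is an illustrative simplification and not the DnaSP implementation: the function names and the toy haplotypes are hypothetical, sites are assumed to be biallelic, and the count of events uses a greedy pass over intervals sorted by their right end, in the spirit of the Hudson and Kaplan (1985) lower bound.

```python
from itertools import combinations


def four_gamete_intervals(haplotypes):
    """Return site pairs (i, j) at which all four gametes are observed.

    Assumes every haplotype is a string of equal length and every site is
    biallelic, so four distinct allele pairs means all four gametes exist.
    """
    n_sites = len(haplotypes[0])
    intervals = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            intervals.append((i, j))
    return intervals


def min_recombination_events(haplotypes):
    """Greedy lower bound on recombination events (an Rm-style count).

    Intervals are scanned in order of their right end; an interval adds an
    event only if it starts at or after the end of the last counted one.
    """
    intervals = sorted(four_gamete_intervals(haplotypes), key=lambda ij: ij[1])
    events, last_end = 0, -1
    for start, end in intervals:
        if start >= last_end:
            events += 1
            last_end = end
    return events


# Hypothetical toy data: sites 0 and 1 show all four gametes.
toy = ["000", "010", "100", "110"]
print(four_gamete_intervals(toy), min_recombination_events(toy))
```

A sample in which no pair of sites shows all four gametes yields zero events, which is the pattern the test treats as compatible with no recombination.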
Expression analysis
Promoter cloning
Primers were designed from GenBank entry GI:4753292 with KpnI- and BglII-tagged restriction sites and ordered from Invitrogen. One of the primers spanned −697G>C (SNP5, ESM Table 1); thus a separate primer for each allele was ordered. Long-range amplifications were performed with either Herculase or PfuUltra hotstart DNA polymerases (Stratagene), with alternative PSNP5 alleles, according to the manufacturer's instructions. Amplicons were specifically extracted from agarose gels using the QIAquick PCR Purification Kit (Qiagen) following the manufacturer's protocol. The promoter haplotypes were then digested with KpnI and BglII restriction enzymes (New England Biolabs) at 37 °C for 2 h in the appropriate buffer. Digested products were purified and force ligated into the pGL3-Basic vector system (Promega) using the LigaFast Rapid DNA Ligation System (Promega). Ligation products were transformed into XL-1 Blue competent cells (Stratagene), and then cultured overnight on LB ampicillin agar plates. Promoter inserts were sequence verified on both strands using primers designed previously to
sequence the overlapping amplicons spanning the region. Positive colonies were cultured overnight at 37 °C in LB/ampicillin medium and plasmids were purified using the QIAprep Spin Miniprep Kit (Qiagen).
Transformation and luciferase assays
For comparison with previous studies (Yuan et al. 2000), a P19 mouse teratocarcinoma cell line was cultured in modified Eagle's medium (MEM) containing nonessential amino acids (NEAA), heat-inactivated fetal bovine serum (FBS) and penicillin-streptomycin L-glutamine (PEST). Twenty-four hours before transfection, the cells were split and seeded in MEM, NEAA, FBS, and antibiotic-free L-glutamine at a density of 2×10⁴ cells per well. The haplotype-vector constructs and the pGL3-Control (Promega) vector were co-transfected with the pRL-TK vector using Lipofectamine 2000 (Invitrogen) according to the manufacturer's instructions. After 24 h the cells were harvested and prepared for the Dual Luciferase Reporter Assay System (Promega) according to the manufacturer's instructions. Relative promoter activity for each haplotype was measured in triplicate as the firefly luciferase/Renilla luciferase ratio, following adjustment for background luminescence. Each triplicate assay was repeated up to five times.
Genotyping
SNP genotyping
Promoter SNPs were genotyped via the dynamic allele-specific hybridization (DASH) method (Howell et al. 1999), using protocols available from Dynametrix (Hillsboro, Ore., USA) at the website (http://www.dynametrixltd.com). Genotypes were scored from fluorescence curves as previously described (Howell et al. 1999). The primer sequences used for PCR amplification are available from the corresponding author, and were modified to amplify a DNA product with the least stable secondary structures using mFold (Zuker 2003). Each allele was probed independently using specific oligo-primers. All genotyping PCR reactions, SNP and microsatellite, were performed on a MJ Research PTC-225 (Wellesley, Mass., USA). Biotinylated products were amplified as follows: 98 °C for 10 min and 40 cycles of 98 °C for 10 s and 55 °C for 30 s. Products were visualized using a 2% TBE-agarose gel.
Microsatellite genotyping
FAM-labeled primers flanking the repeat were designed using CPrimer and ordered from Life Technologies. Sequences are available from the corresponding author. The amplification conditions were 95 °C for 10 min, 39 cycles of 95 °C for 30 s, 55 °C for 1 min and 72 °C for 1 min. Reactions were concluded with a final extension step at 72 °C for 1 min. Genotyping was performed at the Uppsala Genome Center (http://www.genomecenter.uu.se/) using an ABI PRISM 3700 Capillary DNA Genotyper. Allelic lengths were determined using GeneScan Genotyper software (Applied Biosystems). PHASE version 2.1 (Stephens and Donnelly 2003; Stephens et al. 2001) was employed to determine the haplotype and diplotype frequencies with and without the promoter microsatellite in the case and control materials. This also permits the assignment of individual diplotypes to the samples.
Statistical analysis
To evaluate the ability to detect gene-only effects (odds ratio, OR ≥1.6) with the sample sizes described above we used Quanto version 0.5 (Gauderman 2002). The prevalence of obesity in the female Swedish population was taken to be 10% (Lindstrom et al. 2003) and, given recent evidence concerning heterozygotes at position −759 of the promoter (Pooley et al. 2004), it was assumed that HTR2C variants act dominantly. Hardy–Weinberg equilibrium (HWE) for all SNP loci and the microsatellite, corrected for multiple alleles, was determined using PowerMarker version 3.0 (http://www.powermarker.net). To compare the expression of promoter haplotypes, the luciferase activity mean of each triplicate was assessed using the Wilcoxon test implemented in R (http://www.r-project.org). Genotype, haplotype, and diplotype frequency differences between cases and controls were assessed using chi-squared tests or Fisher's exact test where appropriate. Analysis of variance was used to assess SNP, microsatellite, haplotype, and diplotype effects on serum leptin levels/% body fat, corrected as described previously. Logistic regression was applied to determine the magnitude of these effects in terms of ORs and confidence intervals (CIs). This statistical analysis was performed using StatView version 5.0.1.

Table 1 Summary statistics, test statistics and population parameters

                 Promoter               Intragenic             Combined
θW a             9.9×10⁻⁴ ± 4.0×10⁻⁴    6.9×10⁻⁴ ± 2.4×10⁻⁴    7.5×10⁻⁴ ± 2.5×10⁻⁴
π b              16.2×10⁻⁴ ± 1.4×10⁻⁴   8.9×10⁻⁴ ± 1.9×10⁻⁴    10.1×10⁻⁴ ± 1.6×10⁻⁴
Tajima's D c     1.53 NS                0.86979 NS             1.10133 NS
D* d             0.25568 NS             1.18898 NS             1.05302 NS
Hd e             0.585±0.049 (0.485)    0.328±0.071 (0.000)    0.577±0.063 (0.009)
Fs f             3.848 (0.066)          6.245 (0.038)          5.399 (0.0539)
Strobeck's S g   0.069                  0.007                  0.013
Rm h             0                      1                      2
C i              1.3                    0.001                  0.001
ZZ               0.1329 (0.061)         0.0054 (0.493)         0.1960 (0.000)

a Nucleotide polymorphism, Watterson estimator (Watterson 1975)
b Nucleotide diversity (Nei 1987)
c Mutation neutrality, difference between number of segregating sites and average number of nucleotide differences (Tajima 1989)
d Fu and Li's neutrality test, based on the difference between the number of singletons and total number of mutations (Fu and Li 1993)
e Haplotype diversity (Nei 1987). Probability obtained through coalescent simulation with no recombination
f,g Test statistics based on the differences in observed and expected haplotype frequency distributions (Ewens 1972): f Fu's Fs (Fu 1997) and g Strobeck's S, the probability of observing equal or fewer haplotypes (Strobeck 1987)
h Minimum number of recombination events (Hudson and Kaplan 1985)
i Recombination parameter, based on the variance of the average number of nucleotide differences (Hudson 1987)

Table 2 List of promoter HTR2C haplotypes from 64 males

Promoter         Allele   PSNP1   PSNP2   MS      PSNP3    PSNP4   PSNP5   PSNP6
haplotype (PH)   freq.    −1371   −1163   (GT)n   −995     −759    −697    −448
PH1              0.538    T       A       16      G        C       G       T
PH2              0.231    G       G       13      G        C       C       T
PH3              0.169    G       G       13      A        T       C       T
PH4              0.031    T       A       17      G        C       G       T
PH5              0.015    T       A       13      Z(G)a    C       G       T
PH6              0.015    T       A       16      G        C       G       C

Positions of SNPs are given relative to the +1 nucleotide (Xie et al. 1996)
a Allele confirmed by sequencing of cloned haplotype in pGL3 luciferase vector
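The two LD measures used in this study, |D′| and r2, can behave very differently for the same pair of sites. The sketch below illustrates this with the promoter haplotype frequencies listed in Table 2, taking PSNP1 (−1371) and PSNP3 (−995) as an example. The function name and the string encoding of haplotypes (SNP alleles PSNP1 to PSNP6 only, microsatellite omitted) are assumptions made for illustration, and the result is not claimed to reproduce the values plotted in Fig. 1.

```python
def pairwise_ld(haplotypes, site_a, site_b, allele_a, allele_b):
    """Return (|D'|, r2) for two biallelic sites.

    haplotypes is a list of (allele_string, frequency) pairs; frequencies
    are renormalized so that rounding in the published table is tolerated.
    """
    total = sum(freq for _, freq in haplotypes)
    p_a = sum(f for h, f in haplotypes if h[site_a] == allele_a) / total
    p_b = sum(f for h, f in haplotypes if h[site_b] == allele_b) / total
    p_ab = sum(
        f for h, f in haplotypes
        if h[site_a] == allele_a and h[site_b] == allele_b
    ) / total

    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# SNP alleles PSNP1..PSNP6 and frequencies as read from Table 2
# (PH5 is entered with G at PSNP3, the allele confirmed by cloning).
PROMOTER_HAPLOTYPES = [
    ("TAGCGT", 0.538),  # PH1
    ("GGGCCT", 0.231),  # PH2
    ("GGATCT", 0.169),  # PH3
    ("TAGCGT", 0.031),  # PH4
    ("TAGCGT", 0.015),  # PH5
    ("TAGCGC", 0.015),  # PH6
]

d_prime, r2 = pairwise_ld(PROMOTER_HAPLOTYPES, 0, 2, "G", "A")
print(f"|D'| = {d_prime:.2f}, r2 = {r2:.2f}")
```

Because the −995A allele occurs on only one haplotype background, only three of the four possible gametes are present, so |D′| is at its maximum while r2 stays far below one.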
Results
HTR2C SNP discovery
In an effort to characterize HTR2C linkage disequilibrium as well as nucleotide and haplotype variation patterns in a Swedish population, two regions, 1,282 bp of the promoter and 5,223 bp encompassing coding exons 1 and 2, were sequenced in 64 randomly selected male individuals. Approximately 1.16 Mbp were analyzed for SNPs. This provided more than 85% power to detect SNPs with a minor allele frequency greater than or equal to 3% in both regions. In total 26 SNPs were discovered (ESM Table 2). Nineteen of the 20 SNPs located in the intragenic region and all six promoter SNPs could be confirmed on both strands. Twelve of the intragenic SNPs had a less common allele frequency >2%, whereas three of the promoter variants had a minor allele frequency >3%. The remaining SNPs in both loci had minor allele frequencies lower than 2% (ESM Fig. 1). Compared with dbSNP (http://www.ncbi.nlm.nih.gov/SNP/; ESM Table 2) the rare SNPs (PSNP6, ISNP8 and ISNP16) were unique to this sample set. The previously described cysteine to serine substitution rs6318 (ISNP20) was found at a similar frequency (20%) to that in other populations; e.g., Finnish (Lappalainen et al. 1995). A previously reported valine to leucine substitution caused by a C-G transversion (rs2228669) was not found. The GT microsatellite (Deckert et al. 2000) was also polymorphic in our sample set. Using start- and endpoints defined in previous studies (Meyer et al. 2002; Yuan et al. 2000) and validation by sequencing of cloned PCR products from both ends, we found three alleles of lengths 13, 16, and 17 repeats (ESM Table 3). Of these, the 16-repeat allele was the most frequent (ESM Table 3). However, a CA repeat previously described by Deckert et al. (2000) downstream of the GT microsatellite was not polymorphic [(CA)5]. The frequencies of the GT repeat were comparable with previous studies. In a German sample set (Meyer et al. 2002), alleles containing (GT)16/15 and (GT)12 repeats were most common. The combination of (GT)16 and (CA)5, written here as (16/5), was rare in that study, while 16/4 and 15/5, although indistinguishable by length, were the most frequent allele combination. In a Japanese male sample group (Yuan et al. 2000), 17 and 14 (GT) copies were the most common alleles.
Nucleotide and haplotype diversity and neutrality
Values of θw for the promoter region (9.9×10⁻⁴) were elevated compared with that of the intragenic region
(6.9×10⁻⁴) and both regions combined (7.5×10⁻⁴) (Table 1). The average nucleotide heterozygosity, π, for the promoter and intragenic regions was 16.2×10⁻⁴ and 8.9×10⁻⁴, respectively, and for the combined regions, π was 10.1×10⁻⁴. These values fall within the ranges of nucleotide diversity from prior studies across other X-linked loci (Jaruzelska et al. 1999; Nachman et al. 1998). Additionally, neither Tajima's D nor Fu and Li's D* indicates that the discrepancies between π and θw were significant (Table 1); they were most likely due to the increase in sample size. The allele distribution is therefore consistent with a neutral mutation hypothesis (ESM Fig. 1). Hence, it is expected that two chromosomes will differ on average once every 617 and 1,055 bp in the promoter and intragenic regions, respectively. A total of six intragenic haplotypes were found for the X-linked gene (Tables 2, 3), of which the most common, IH1, accounted for 81% of the male chromosomes. The majority of lower frequency haplotypes differed from the IH2 haplotype by just one or two nucleotides. Three of the six promoter haplotypes (PH1, PH2, and PH3) comprised over 80% of the haplotype variation. Combining both regions, a total of 11 haplotypes were observed, of which the most frequent haplotype, PH1/IH1, accounted for 58% of the overall variation. The number of promoter and combined haplotypes was reduced to four and nine, respectively, upon exclusion of the microsatellite (ESM Table 4). Promoter haplotypes PH1 and PH2 occurred most frequently with haplotypes IH1 and IH2, respectively. PH3, which is most similar to PH2, was more often observed with IH1, indicating possible crossover events. Furthermore, haplotype clustering in both regions according to similarity revealed additional haplotypes, potentially resulting from crossover events (ESM Table 4). Haplotype diversity (Hd), without the microsatellite, was approximately 0.585 for the promoter.
Hd in the intragenic region was 0.328, while the combined Hd was 0.577 (Table 1), similar to that of the promoter. DnaSP 4.0 could not measure diversity with the microsatellite; thus the estimated diversity perhaps undervalues the true promoter and combined heterozygosity, which are both greater than that of the intragenic region. For the intragenic and combined regions, the large positive Fs values were significant and higher than that of the promoter region (promoter: 3.848; intragenic: 6.245; combined: 5.399). Strobeck's S displays a similar trend, complementing Fu's Fs. These results possibly indicate a deficit in recent mutations and haplotypes given the levels of diversity (Table 1) and could indicate possible population subdivision or balancing selection. Since both measures are based on θ calculated from the number of pairwise differences, genetic crossovers possibly best explain the decrease in both Fs and Strobeck's S toward and within the promoter.
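As a rough check on the promoter haplotype diversity, Nei's (1987) estimator Hd = n/(n − 1) × (1 − Σ p_i²) can be applied to the Table 2 frequencies. The sketch below makes several assumptions that are not stated in the source: haplotypes PH1, PH4 and PH5 are pooled because they are identical at the SNP sites once the microsatellite is excluded, the published frequencies are renormalized to sum to one, and n is taken as 64 chromosomes. The function name is hypothetical.

```python
def haplotype_diversity(frequencies, n_chromosomes):
    """Nei's haplotype diversity: n / (n - 1) * (1 - sum of squared freqs).

    Frequencies are renormalized first to absorb rounding in the table.
    """
    total = sum(frequencies)
    homozygosity = sum((f / total) ** 2 for f in frequencies)
    return n_chromosomes / (n_chromosomes - 1) * (1.0 - homozygosity)


# Assumed pooling of Table 2 promoter haplotypes without the microsatellite:
# PH1 + PH4 + PH5 share the same SNP alleles; PH2, PH3 and PH6 are distinct.
promoter_no_ms = [0.538 + 0.031 + 0.015, 0.231, 0.169, 0.015]

hd = haplotype_diversity(promoter_no_ms, n_chromosomes=64)
print(f"Promoter Hd without microsatellite: {hd:.3f}")
```

Under these assumptions the estimate lands close to the promoter value of 0.585 given in Table 1.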
Table 3 List of HTR2C intragenic haplotypes from 94 males, including rare SNPs

Intragenic haplotypes (IH)   IH1     IH2     IH3     IH4     IH5     IH6
Allele freq.                 0.813   0.125   0.010   0.010   0.010   0.010
IESNP3                       G       A       G       A       A       A
IESNP4                       A       G       A       G       G       G
IESNP5                       C       A       C       A       A       A
IESNP6                       T       A       A       A       A       A
IESNP7                       T       C       T       C       C       C
IESNP8                       C       C       C       C       C       T
IESNP9                       A       G       A       G       G       G
IESNP10                      A       G       A       G       G       G
IESNP11                      G       A       G       A       A       A
IESNP12                      G       T       G       T       T       T
IESNP13                      G       A       G       A       A       A
IESNP14                      C       T       C       T       T       T
IESNP15                      G       C       G       C       C       C
IESNP16                      G       G       G       G       G       G
IESNP17                      A       G       A       G       G       G
IESNP18                      A       G       A       G       G       G
IESNP19                      A       G       A       A       G       G
IESNP20                      G       C       G       C       G       C
Linkage disequilibrium
Strong LD (|D′|≥0.8) was observed across and between the promoter and intragenic regions, which are separated by 150 kb (Fig. 1). Such extensive levels have been recorded previously in the gene using |D′| and in other regions of the X chromosome (Gutierrez et al. 1996; Taillon-Miller et al. 2000). However, according to r2, LD was much lower between the two regions, indicating ancestral crossover, but remained significant following Bonferroni correction (Fig. 1). Measured by r2, LD across the promoter decreased rapidly considering the distance between the promoter SNPs. Promoter SNPs −995G>A (PSNP3) and −759C>T (PSNP4) did not show reliable LD with rs6318 (IESNP20) in coding exon 2, although |D′| was equal to one. It has been suggested that high |D′| but low r2, as found here, may be best explained by gene conversion (Andolfatto and Nordborg 1998; Frisse et al. 2001). Interestingly, −995G>A (PSNP3) and −759C>T (PSNP4) exhibited similar r2 values with the remaining promoter SNPs as the latter showed with the intragenic region. Thus, our results suggest that ancestral crossover caused by gene conversion and recombination could be contributing to the lack of allelic association with rs6318 (ISNP20).
Fig. 1 Linkage disequilibrium (LD) shown by |D′| and r2, within and between the HTR2C promoter (PSNP) and intragenic (IESNP) regional SNPs. For clarity the rare SNPs in both regions have been removed. In terms of |D′| both regions show a high degree of LD (|D′|≥0.8). According to r2, there is insignificant LD (r2 ≤ 0.1) between promoter SNPs PSNP3 and PSNP4 and the intragenic region, while these SNPs are in weak LD with the remaining promoter SNPs (0.1 ≤ r2 ≤ 0.3). * IESNP20 is the G-C transversion rs6318 commonly used in associations with HTR2C-related phenotypes
Recombination and gene conversion events
Using the FGT, Rm (Hudson and Kaplan 1985), we observed one recombination event in the intragenic region, concentrating around rs6318. Another recombination event had occurred between the promoter and intragenic regions (Table 1). Even though we found greater nucleotide and haplotype diversity, no recombination events took place in the promoter itself, while the population recombination parameter, C, was found to be much greater than across the intragenic region (1.3 vs 0.001; Table 1). If the local recombination parameter from combining both regions (0.001) is taken as negligible, then the elevated parameter value in the promoter is due to gene conversion and explains the sudden drop in LD between SNPs (Andolfatto and Nordborg 1998). We additionally used the ZZ test (Rozas et al. 2001), from which it is possible to obtain an estimate of C, although we have not done this here (Table 1). Nevertheless, our results clearly show that ancestral recombination has occurred between the two regions and that gene conversion within the promoter has altered the LD pattern.
Promoter expression analysis
We investigated the effect of the most common promoter haplotypes on expression using a luciferase expression vector. In contrast to previous results (Yuan et al. 2000), we found that although the haplotype GG13ATC did express on average 17% lower than the most common haplotype TA16GCG, this was not significant (P=0.12; Fig. 2). On the other hand, the second most common haplotype, GG13GCC, expressed significantly lower (by 21%) than TA16GCG (P=0.0075). Luciferase activity could be manipulated at a strong transcription start point (Xie et al. 1996), although not significantly: replacing the −697C allele with G increased the expression of GG13ATG and GG13GCG to 26% (P=0.055) and 23% (P=0.09) above GG13ATC and GG13GCC, respectively. Replacing the −697G allele of TA16GCG with C (TA16GCC) significantly reduced expression by 17% (P=0.011). Thus, there appears to be variation in the expression of the most common HTR2C promoter haplotypes, in which the −697C allele results in a reduced expression that is further decreased in the presence of the −995G allele and −759C allele in the haplotype, perhaps due to gene conversion. Length of the GT microsatellite could also be influential in expression compensation.
Fig. 2 Expression analysis of promoter haplotypes using a dual luciferase reporter gene assay. Expression levels are presented as percentages relative to the most common haplotype TA16GCG. Haplotypes were manipulated at position −697 by replacement with opposite alleles. CONTROL and BASIC are the pGL3-Control and pGL3-Basic vectors, respectively
Possible alterations in the affinity of transcription factors and complexes at HTR2C promoter polymorphism sites
could contribute to variability in gene expression. Analysis using JASPAR (Sandelin et al. 2004) predicted that the −995G allele may alter the binding of transcription factors MZF (myeloid zinc finger) and SP1. No such disruptions were found for −759C>T. Extension of the microsatellite, separated from −995G>A (PSNP3) by just 32 bp, could influence the binding of the RAS-responsive element binding protein 1 (RREB1). Although it appears here that gene conversion is a factor in expression, further extensive analysis is required to fully understand the consequences for transcription factor and complex binding.
Table 4 HTR2C promoter haplotype analysis. SNP alleles are shown beginning with the most distant from the +1 nucleotide [-1371T>G (PSNP1), -1163A>G (PSNP2), GT microsatellite, -995G>A (PSNP3), -697G>C (PSNP5)]. Haplotype frequencies differ significantly between cases (BMI ≥30 kg m⁻²) and controls (BMI <30 kg m⁻²) (P<0.0001)

| Haplotype | BMI <30 kg m⁻² (n) | Frequency | BMI ≥30 kg m⁻² (n) | Frequency | Total (n) | Frequency |
|---|---|---|---|---|---|---|
| TA16GG | 586 | 0.609 | 329 | 0.561 | 915 | 0.591 |
| GG13AC | 156 | 0.162 | 94 | 0.160 | 250 | 0.161 |
| GG13GC | 140 | 0.15 | 81 | 0.14 | 221 | 0.143 |
| TA17GG | 15 | 0.016 | 9 | 0.015 | 24 | 0.016 |
| TA13GG ᵃ | 12 | 0.012 | 31 | 0.053 | 43 | 0.028 |
| TA18GG | 11 | 0.011 | 3 | 0.005 | 14 | 0.009 |
| GG16AC ᵃ | 0 | 0.000 | 13 | 0.022 | 13 | 0.008 |
| GG13GG | 9 | 0.009 | 6 | 0.010 | 15 | 0.010 |
| GG16GC | 9 | 0.009 | 11 | 0.019 | 20 | 0.013 |
| TA16GC | 9 | 0.009 | 4 | 0.007 | 13 | 0.008 |
| GG13AG | 7 | 0.007 | 1 | 0.002 | 8 | 0.005 |
| GA13GG | 2 | 0.002 | 0 | 0.000 | 2 | 0.001 |
| GA16GG | 2 | 0.002 | 1 | 0.002 | 3 | 0.002 |
| TG13GC | 2 | 0.002 | 1 | 0.002 | 3 | 0.002 |
| GG16GG | 1 | 0.001 | 1 | 0.002 | 2 | 0.001 |
| TG16GG | 1 | 0.001 | 1 | 0.002 | 2 | 0.001 |
| Total (2N) | 962 | | 586 | | 1548 | |

ᵃ Haplotypes GG16AC and TA13GG (OR=4.42, 95% CI=2.25, 8.68) are the primary contributors
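The odds ratio quoted for TA13GG can be checked against the counts in Table 4 (31 of 586 case chromosomes and 12 of 962 control chromosomes). The following is a minimal sketch only: it assumes a simple 2x2 carrier/non-carrier table and a Woolf (log-scale) confidence interval, which the text does not state was the method used, and the function name `odds_ratio_woolf` is hypothetical. Under these assumptions the published values are recovered.

```python
import math


def odds_ratio_woolf(case_with, case_total, ctrl_with, ctrl_total, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval from a 2x2 table.

    Assumption: the interval method is not given in the text; Woolf's
    approximation is used here purely for illustration.
    """
    a = case_with                 # case chromosomes carrying the haplotype
    b = case_total - case_with    # case chromosomes without it
    c = ctrl_with                 # control chromosomes carrying the haplotype
    d = ctrl_total - ctrl_with    # control chromosomes without it
    if min(a, b, c, d) == 0:
        # A zero cell makes the odds ratio undefined or infinite.
        raise ValueError("odds ratio cannot be estimated with a zero cell")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high


# TA13GG counts taken from Table 4.
ta13gg_or, ci_low, ci_high = odds_ratio_woolf(31, 586, 12, 962)
print(f"OR={ta13gg_or:.2f}, 95% CI={ci_low:.2f}, {ci_high:.2f}")
# prints: OR=4.42, 95% CI=2.25, 8.68
```

Calling the same function with the GG16AC counts (13 of 586 cases, 0 of 962 controls) raises an error because of the empty control cell, in line with the OR for that haplotype not being estimated.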
HTR2C promoter associations with BMI and serum leptin/% body fat
The polymorphisms that provided most information about diversity at the HTR2C promoter [-1371T>G (PSNP1), -1163A>G (PSNP2), -995G>A (PSNP3), -697G>C (PSNP5), and the GT microsatellite] were genotyped in cases (BMI ≥30 kg m⁻², n=293) and controls (BMI <30 kg m⁻², n=481). With these sample sizes and a population risk of 10%, the power to uncover weak genetic associations (OR=1.6) under a dominant model was estimated to be 53, 75, and 89% for disease allele frequencies of 5, 10, and 25%, respectively. The microsatellite and SNPs conformed to Hardy-Weinberg equilibrium expectations in the control material (P>0.26). In comparison with the male sample set used for SNP discovery, no significant differences were found in the frequencies of the most common haplotypes, although PH2 is the third most frequent in the female group. This may have been due to the greater female sample size. Assessment of allele, genotype and microsatellite-free haplotype frequencies failed to show any significant difference between cases and controls. However, when the entire promoter haplotypes were tested, a significant deviation in frequency distribution was found between cases and controls, especially for GG16AC and TA13GG (P<0.0001; Table 4). These haplotypes were found at a greater frequency in the case material than in controls (TA13GG: 5 vs 1%; GG16AC: 2 vs 0%). The genetic risk of obesity for a TA13GG carrier was increased compared with controls (OR=4.42; 95% CI=2.25, 8.68). The OR for GG16AC was not estimated, as this haplotype was not observed in the control group. Given the very low haplotype frequency, these results should be interpreted with caution, and a greater sample size is required to determine the presence of this low-frequency haplotype in the normal population. Differences in diplotype frequencies were found for TA13GG/TA16GG and rare diplotypes (P=0.0006; Table 5). The rare diplotype group contained the GG16AC alleles that contributed to the significant results above.

Table 5 HTR2C promoter diplotype analysis. Diplotype frequencies differ significantly between cases (BMI ≥30 kg m⁻²) and controls (BMI <30 kg m⁻²) (P=0.0006). Values are n (frequency)

| Diplotype | BMI <30 kg m⁻² | BMI ≥30 kg m⁻² | Total |
|---|---|---|---|
| TA16GG/TA16GG | 173 (0.36) | 90 (0.31) | 263 (0.34) |
| GG13AC/TA16GG | 98 (0.2) | 54 (0.18) | 152 (0.2) |
| GG13GC/TA16GG | 96 (0.2) | 46 (0.16) | 142 (0.18) |
| GG13GC/GG13AC | 21 (0.04) | 16 (0.05) | 37 (0.05) |
| TA13GG/TA16GG ᵃ | 6 (0.01) | 20 (0.07) | 26 (0.03) |
| GG13AC/GG13AC | 14 (0.03) | 7 (0.02) | 21 (0.03) |
| TA16GG/TA17GG | 9 (0.02) | 6 (0.02) | 15 (0.02) |
| GG13GC/GG13GC | 8 (0.02) | 7 (0.02) | 15 (0.02) |
| TA16GG/TA18GG | 10 (0.02) | 2 (0.01) | 12 (0.02) |
| Rare ᵇ | 46 (0.1) | 45 (0.15) | 91 (0.12) |
| Total (n) | 481 | 293 | 774 |

ᵃ Also contributing to the Rare signal
ᵇ Diplotypes contributing to the signal: (1) TA13GG/TA16GG (OR=5.8, 95% CI=2.3, 14.6), (2) TA13GG/TA13GG (OR=6.64, 95% CI=0.74, 59.73), (3) GG16AC/TA13GG, (4) GG16AC/GG16AC

Selective removal and inclusion of specific (rare) diplotypes in the analysis (not presented) showed that the diplotypes TA13GG/TA16GG (OR=5.8; 95% CI=2.3, 14.6), TA13GG/TA13GG (OR=6.64; 95% CI=0.74, 59.73) and GG16GC/GG16AC were the primary contributors to this association. The confidence interval of the OR for the homozygous diplotype TA13GG/TA13GG spanned 1.0; thus the real contribution of this particular diplotype remains doubtful. GG16AC/TA16GG also strongly contributed to the association, but as the GG16AC haplotype was not observed in the control material, odds ratios were not calculated. It has been shown that the C allele and heterozygotes at -759 (PSNP4) were found more often in an obese group (BMI ≥30 kg m⁻²) than in a non-obese female group (BMI ≤25 kg m⁻²) (Pooley et al. 2004). Given the degree of LD between -995G>A (PSNP3) and -759C>T (PSNP4) shown by our study, we tried
to replicate their study in our material using -995G>A (PSNP3). Power for this study was minimally reduced (not shown). We were unable to repeat the association, but nevertheless found the same haplotypes and diplotypes shown above in association (P<0.0001 and P=0.0006, respectively). Diplotypes were again heterozygous for either SNPs or the microsatellite, supporting the hypothesis that heterosis at the HTR2C promoter plays an influential role in modulating weight and BMI (Pooley et al. 2004).

Serum leptin levels were available for 459 individuals and were corrected for the influences of body fat on leptin levels (R²=0.65), age and BMI. The resulting serum leptin/% body fat values were normally distributed. The -995G>A (PSNP3) polymorphism was found to associate with serum leptin/% body fat (P=0.01, Bonferroni-corrected P=0.03; Table 6). Heterozygotes (-995G/A) showed higher serum leptin/% body fat ratios than the homozygotes, but this result warrants caution following Bonferroni correction. A higher, but non-significant, serum leptin/% body fat ratio was also observed for -995G/G than for -995A/A. Differences between the diplotypes TAGG/GGAC and GGAC/GGAC were evident (P=0.047). The latter also showed differences with GGGC/GGAC (P=0.0234). Other comparisons in which one of the diplotypes was heterozygous at -995G>A (PSNP3) showed trends toward significance (underlined). Inclusion of the microsatellite showed trends toward significance between the two most common diplotypes TA16GG/TA16GG and GG13AC/TA16GG [P=0.0389; Table 6, GT microsatellite and -995G>A (PSNP3) underlined]. Further significant differences were found between TA13GG/TA16GG and GG13AC/GG13AC (P=0.0134), TA16GG/TA16GG and TA13GG/TA16GG (P=0.0220), and GG13AC/TA16GG and GG13AC/GG13AC (P=0.0345). In all comparisons, diplotypes with a higher serum leptin/% body fat were heterozygous at either the GT microsatellite, -995G>A (PSNP3) or both. This was more evident in those with an increased risk for obesity as noted above, for example, TA13GG/TA16GG (OR=5.8; 95% CI=2.3, 14.6). These results suggest a possible influential role for the microsatellite and -995G>A: -995G/A heterozygotes show higher ratios and -995A/A homozygotes lower ratios, which is suggestive of a possible interaction between HTR2C and serum leptin. We stress that care needs to be taken given the levels of significance and the number of multiple tests performed. Following Bonferroni correction, these diplotype assessments were not significant.
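The effect of the Bonferroni correction on the reported P values can be illustrated with a short sketch. The number of tests actually used for the correction is not stated; for the genotype analysis a factor of three is assumed here because it reproduces the reported corrected value of 0.03, and for the diplotypes only the four comparisons listed above are counted, which is a lower bound on the true number of tests. The function name `bonferroni` is hypothetical.

```python
def bonferroni(p_values, n_tests=None):
    """Bonferroni adjustment: multiply each P value by the number of tests, capped at 1.

    If n_tests is not given, the number of supplied P values is used.
    """
    m = len(p_values) if n_tests is None else n_tests
    if m < len(p_values):
        raise ValueError("n_tests cannot be smaller than the number of P values")
    return [min(1.0, p * m) for p in p_values]


# Genotype ANOVA P value from Table 6. ASSUMPTION: three tests.
snp_adjusted = bonferroni([0.0109], n_tests=3)[0]
print(f"-995G>A corrected P = {snp_adjusted:.2f}")

# Diplotype comparisons reported in the text. ASSUMPTION: only these four
# comparisons are counted, so the adjustment shown is a minimum.
diplotype_p = {
    "TA16GG/TA16GG vs GG13AC/TA16GG": 0.0389,
    "TA13GG/TA16GG vs GG13AC/GG13AC": 0.0134,
    "TA16GG/TA16GG vs TA13GG/TA16GG": 0.0220,
    "GG13AC/TA16GG vs GG13AC/GG13AC": 0.0345,
}
diplotype_adjusted = dict(zip(diplotype_p, bonferroni(list(diplotype_p.values()))))
for comparison, p_adj in diplotype_adjusted.items():
    print(f"{comparison}: corrected P = {p_adj:.4f}")
```

Even under this lenient count, the smallest diplotype P value (0.0134) exceeds 0.05 after adjustment, consistent with the statement that the diplotype assessments were not significant following correction.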
Table 6 Analysis of variance of the serum leptin/% body fat distribution across the genotypes of HTR2C promoter SNP3

| Genotype ᵃ | Mean serum leptin/% body fat ± SD |
|---|---|
| G/G | 0.583±0.287 |
| G/A ᵇ | 0.663±0.358 |
| A/A | 0.479±0.191 |

| Diplotype ᶜ | Mean serum leptin/% body fat ± SD |
|---|---|
| TA16GG/TA16GG | 0.555±0.275 |
| GG13AC/TA16GG | 0.638±0.336 |
| GG13GC/TA16GG | 0.627±0.323 |
| GG13GC/GG13AC | 0.633±0.329 |
| TA13GG/TA16GG ᵈ | 0.706±0.236 |
| GG13AC/GG13AC | 0.453±0.189 |
| TA16GG/TA17GG | 0.544±0.160 |
| GG13GC/GG13GC | 0.664±0.289 |
| TA16GG/TA18GG | 0.468±0.203 |
| Rare | 0.643±0.367 |

ᵃ P=0.0109
ᵇ Significant differences between G/A and the G/G and A/A homozygotes (P=0.0117 and 0.0239, respectively)
ᶜ P=0.0848
ᵈ Differences in the distribution of serum leptin/% body fat with TA16GG/TA16GG (P=0.0220) and GG13AC/GG13AC (P=0.0134)

Discussion

Interest in the serotonin receptor 2C (HTR2C) as a candidate gene for weight gain and obesity stems from knockout mouse models (Tecott et al. 1995) and pharmacological studies (Sargent et al. 1997). Nevertheless, previous HTR2C studies aimed at identifying disease-associated alleles have been inconsistent (Lentes et al. 1997; Westberg et al. 2002). Among the elements of an association study that determine the ability to find disease alleles, knowledge of the population and LD structure of the gene is important. Here we sought to
determine this structure and the underlying events that shaped the patterns in HTR2C through comparative sequencing around commonly used markers and possible functional variants, and then to apply this information in association studies of obesity and serum leptin levels. Although nucleotide diversity estimates for other serotonin receptors are much lower than those reported here for HTR2C, because of the previous focus on exons (Glatt et al. 2004), we found the distribution of HTR2C SNPs to be consistent, under the infinite sites model, with the neutral mutation hypothesis. Haplotype distribution and diversity analysis indicate that there is a haplotype deficit for the given level of diversity and a dominance of only two and three haplotypes in the intragenic and combined regions (Fu 1997; Strobeck 1987). Under the assumption that population dynamics act across larger regions, these observations suggest balancing (overdominant) selection or population subdivision (Fu 1997; Strobeck 1987) and perhaps explain the increase in Tajima’s D and Fu and Li’s D* through a decrease in the number of recent mutations (Fu and Li 1993). On the other hand, crossover events such as recombination and gene conversion, marked by the FGT, the population recombination parameter C and drift, would reverse these effects, as is evident from the decrease in Fs and S between the regions and particularly in the HTR2C promoter. A functional role for balancing selection is perhaps marked by the influence of -697G>C (PSNP5) alleles on expression. Increases in luciferase expression are evident in the presence of the G allele over the C allele on all haplotypes, while the counter haplotypes TA16GCG and GG13ATC show no significant difference. This could imply that the other polymorphisms are important compensators of regulation. Hence the reduced expression of GG13GCC compared with the most common haplotype might suggest a functional influence of gene conversion. 
However, no significant differences were found between GG13ATC and GG13GCC, owing to the wide distribution of GG13ATC mean measures, but the upward shift in the median of GG13ATC would suggest that -995A and -759T are important for expression regulation. Yuan et al. (2000), and more recently Buckland et al. (2005), have shown that -759T is especially important in expression. As a result, balancing selection may act, through stability between promoter SNPs, to maintain HTR2C expression, and disruption of this balance, perhaps by gene conversion, may influence expression and disease. Although our study is the first to consider the entire promoter, analysis of other genomic regions and additional populations is required to determine whether the observed nucleotide, haplotype and expression patterns are a result of balancing selection. However, if the increasing number of studies associating alleles at -759 with phenotypes such as obesity and weight gain is related to its function, the underlying LD pattern observed in our study is critical in correlating HTR2C markers with disease phenotypes.
Strong LD across HTR2C has been seen previously between the promoter GT microsatellite and the commonly used association marker rs6318 (|D’|=1, P=0.000001) (Gutierrez et al. 1996). However, our study indicated that -995G>A (PSNP3) and -759C>T (PSNP4) were not in significant LD with rs6318 (r²<0.1) and showed similarly low levels of LD with the adjacent promoter SNPs. The remaining SNPs demonstrated values near r²>0.33, the lower limit considered advantageous for the use of markers in disease association mapping (Ardlie et al. 2002). If recent evidence that the promoter SNPs -995G>A (PSNP3) and -759C>T (PSNP4) are involved in HTR2C contributions to obesity and weight gain holds true, then the level of LD between the HTR2C promoter and the intragenic region is insufficient for mapping disease-related variants. Even utilizing adjacent promoter polymorphisms will require a three- to five-fold increase in sample size to find an association with the same power as testing the susceptibility SNP(s) directly. Hence this study highlights the importance of understanding the LD of candidate genes and the events that shaped this pattern. Events like gene conversion, if frequent, could affect the use of SNPs in a number of other candidate genes where it is assumed that genetic markers are in LD with disease polymorphisms. We were unable to find SNP associations directly with BMI, whereas promoter haplotypes, most notably TA13GG, showed an increased risk for obesity (P<0.0001). We also found deviations in the diplotype distribution of BMI (P=0.0006) and of serum leptin/% body fat ratios, involving heterozygotes at either the GT microsatellite or -995G>A (PSNP3). 
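The contrast between a high |D'| and a low r², and the sample size penalty for testing a marker instead of the susceptibility variant, can be sketched as follows. This is an illustration under stated assumptions: the allele and haplotype frequencies are hypothetical and are not taken from our data, and the inflation factor uses the commonly cited 1/r² approximation. The function names `ld_stats` and `sample_size_inflation` are likewise hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return D, D' and r^2 for two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele A at the first
    locus and allele B at the second; p_a and p_b are the allele frequencies.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


def sample_size_inflation(r2):
    """Approximate fold increase in sample size needed when a marker in LD
    (r^2) with the causal variant is tested instead of the variant itself."""
    if not 0 < r2 <= 1:
        raise ValueError("r2 must be in (0, 1]")
    return 1.0 / r2


# HYPOTHETICAL frequencies: a rare allele B (10%) found only on the
# background of a common allele A (50%).
d, d_prime, r2 = ld_stats(p_ab=0.1, p_a=0.5, p_b=0.1)
print(f"D'={d_prime:.2f}, r2={r2:.3f}, inflation={sample_size_inflation(r2):.1f}x")

# r^2 values of about 0.33 and 0.2 correspond to three- and five-fold increases.
for value in (1 / 3, 0.2):
    print(f"r2={value:.2f}: {sample_size_inflation(value):.1f}x")
```

In this hypothetical case |D'| equals 1 while r² is only about 0.11, which shows how complete LD by one measure can coexist with poor marker utility by the other.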
Although the reduced dimensionality of haplotype and diplotype analysis may give greater power to discern cis- and trans-allelic combinations that may influence a trait, such as heterozygosity or disruption of haplotype balance by gene conversion, the low frequency of these haplotypes means that a larger replication study is required to verify the observations. Nevertheless, the association between -995G>A (PSNP3) and elevated serum leptin/% body fat ratios was marginally significant, with heterozygotes showing higher levels than both homozygotes, supporting the hypothesis of heterosis. However, these observations warrant caution following Bonferroni correction. Heterozygosity at -759C>T (PSNP4), which is in LD with -995G>A (PSNP3), has been shown to be associated with higher triglyceride levels in women (Pooley et al. 2004). Additionally, a recent study of schizophrenic patients administered clozapine reported weight gain as well as increases in serum triglyceride and leptin levels (Atmaca et al. 2003). Taken together with these associations, our study may provide further evidence that heterosis at the HTR2C promoter is important. Male X-chromosome hemizygosity therefore possibly implies sex-specific effects, with the main heterozygous effects seen only in females and perhaps mediated through interactions with serum leptin. There is strong evidence for an interaction between leptin and the 5-HTergic system, although other studies
might dispute this (Nonogaki et al. 1998). Leptin has been shown to stimulate 5-HTergic activity in the brain of mice as well as in other animal species (Finn et al. 2001). HTR2C density and mRNA have been found to increase in the hypothalamus of mice on high-fat diets (Huang et al. 2004), in response to low 5-HT concentrations in the ventromedial hypothalamic nucleus (VMH). Increases in leptin together with a reduction in HTR2C mRNA in the hypothalamus have also been observed on high-fat diets (Schaffhauser et al. 2002). In rodents, leptin-induced anorexia and suppression of milk intake were attenuated by the HTR2C antagonists SB242084 and SB206553, respectively (von Meyenburg et al. 2003; Yamada et al. 2003). Hypophagia induced by the 5-HT releaser and uptake inhibitor d-fenfluramine has also been shown to be mediated by HTR2C (Vickers et al. 2001). Susceptibility to hyperphagia resulting in obesity could possibly be a response to low 5-HT or poor expression of the HTR2C receptor in the presence of high levels of leptin. Our data showing a strong genotype effect of the HTR2C polymorphism -995G>A on body fat-adjusted circulating leptin levels indirectly support the notion that HTR2C is involved in the leptin signaling pathways. It is possible that -995G/A heterozygotes are more leptin resistant, having higher leptin levels than would be expected from their body fat content. While there are several other possible reasons for the lack of power and inconsistencies in association studies (Cardon and Bell 2001), we conclude that the population and LD structure of HTR2C has influenced associations in which functional promoter variants may be contributing to susceptibility to obesity and hyperphagia. Complementary evidence of overdominance and heterosis suggests that these could be features of the promoter for which gene conversion could be important. 
We suggest that future studies of HTR2C should include promoter variants representative of its structure, sequence longer regions from biologically relevant genes, and use nucleotide and haplotype distribution and diversity to understand the action of gene conversion on local and long-range LD.

Acknowledgements We thank the Swedish Research Council, Novo Nordic, the Swedish Diabetes Foundation and the Foundation for Strategic Research, Sweden, for funding this work.
References

Andolfatto P, Nordborg M (1998) The effect of gene conversion on intralocus associations. Genetics 148:1397–1399 Ardlie KG, Kruglyak L, Seielstad M (2002) Patterns of linkage disequilibrium in the human genome. Nat Rev Genet 3:299–309 Atmaca M, Kuloglu M, Tezcan E, Ustundag B (2003) Serum leptin and triglyceride levels in patients on treatment with atypical antipsychotics. J Clin Psychiatr 64:598–604 Buckland PR, Hoogendoorn B, Guy CA, Smith SK, Coleman SL, O’Donovan MC (2005) Low gene expression conferred by association of an allele of the 5-HT2C receptor gene with antipsychotic-induced weight gain. Am J Psychiatry 162:613–615 Burnet PW, Harrison PJ, Goodwin GM, Battersby S, Ogilvie AD, Olesen J, Russell MB (1997) Allelic variation in the serotonin 5HT2C receptor gene and migraine. Neuroreport 8:2651–2653
Burnet PW, Smith KA, Cowen PJ, Fairburn CG, Harrison PJ (1999) Allelic variation of the 5-HT2C receptor (HTR2C) in bulimia nervosa and binge eating disorder. Psychiatr Genet 9:101–104 Cardon LR, Bell JI (2001) Association study designs for complex diseases. Nat Rev Genet 2:91–99 Clark AG, Weiss KM, Nickerson DA, Taylor SL, Buchanan A, Stengard J, Salomaa V, Vartiainen E, Perola M, Boerwinkle E, Sing CF (1998) Haplotype structure and population genetic inferences from nucleotide-sequence variation in human lipoprotein lipase. Am J Hum Genet 63:595–612 Deckert J, Meyer J, Catalano M, Bosi M, Sand P, DiBella D, Ortega G, Stober G, Franke P, Nothen MM, Fritze J, Maier W, Beckmann H, Propping P, Bellodi L, Lesch KP (2000) Novel 5′ regulatory region polymorphisms of the 5-HT2C receptor gene: association study with panic disorder. Int J Neuropsychopharmacol 3:321–325 Drysdale CM, McGraw DW, Stack CB, Stephens JC, Judson RS, Nandabalan K, Arnold K, Ruano G, Liggett SB (2000) Complex promoter and coding region beta 2-adrenergic receptor haplotypes alter receptor expression and predict in vivo responsiveness. Proc Natl Acad Sci USA 97:10483–10488 Ewens WJ (1972) The sampling theory of selectively neutral alleles. Theor Popul Biol 3:87–112 Ewing B, Green P (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8:186–194 Ewing B, Hillier L, Wendl MC, Green P (1998) Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 8:175–185 Finn PD, Cunningham MJ, Rickard DG, Clifton DK, Steiner RA (2001) Serotonergic neurons are targets for leptin in the monkey. J Clin Endocrinol Metab 86:422–426 Frisse L, Hudson RR, Bartoszewicz A, Wall JD, Donfack J, Di Rienzo A (2001) Gene conversion and different population histories may explain the contrast between polymorphism and linkage disequilibrium levels. 
Am J Hum Genet 69:831–843 Fu YX (1997) Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925 Fu YX, Li WH (1993) Statistical tests of neutrality of mutations. Genetics 133:693–709 Gauderman WJ (2002) Sample size requirements for matched case-control studies of gene-environment interaction. Stat Med 21:35–50 Glatt CE, DeYoung JA, Delgado S, Service SK, Giacomini KM, Edwards RH, Risch N, Freimer NB (2001) Screening a large reference sample to identify very low frequency sequence variants: comparisons between two genes. Nat Genet 27:435–438 Glatt CE, Tampilic M, Christie C, DeYoung J, Freimer NB (2004) Re-screening serotonin receptors for genetic variants identifies population and molecular genetic complexity. Am J Med Genet 124B:92–100 Gordon D, Abajian C, Green P (1998) Consed: a graphical tool for sequence finishing. Genome Res 8:195–202 Gutierrez B, Fananas L, Arranz MJ, Valles V, Guillamat R, van Os J, Collier D (1996) Allelic association analysis of the 5-HT2C receptor gene in bipolar affective disorder. Neurosci Lett 212:65–67 Hartl DL, Clark AG (1997) Principles of population genetics, 3rd edn. Sinauer, Sunderland Hill W, Robertson A (1968) Linkage disequilibrium in finite populations. Theor Appl Genet 38:226–239 Howell WM, Jobs M, Gyllensten U, Brookes AJ (1999) Dynamic allele-specific hybridization. A new method for scoring single nucleotide polymorphisms. Nat Biotechnol 17:87–88 Huang XF, Huang X, Han M, Chen F, Storlien L, Lawrence AJ (2004) 5-HT(2A/2C) receptor and 5-HT transporter densities in mice prone or resistant to chronic high-fat diet-induced obesity: a quantitative autoradiography study. Brain Res 1018:227–235 Hudson RR (1987) Estimating the recombination parameter of a finite population model without selection. Genet Res 50:245–250
Hudson RR, Kaplan NL (1985) Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164 Jaruzelska J, Zietkiewicz E, Batzer M, Cole DE, Moisan JP, Scozzari R, Tavare S, Labuda D (1999) Spatial and temporal distribution of the neutral polymorphisms in the last ZFX intron: analysis of the haplotype structure and genealogy. Genetics 152:1091–1101 Lappalainen J, Long JC, Virkkunen M, Ozaki N, Goldman D, Linnoila M (1999) HTR2C Cys23Ser polymorphism in relation to CSF monoamine metabolite concentrations and DSM-III-R psychiatric diagnoses. Biol Psychiatr 46:821–826 Lappalainen J, Zhang L, Dean M, Oz M, Ozaki N, Yu DH, Virkkunen M, Weight F, Linnoila M, Goldman D (1995) Identification, expression, and pharmacology of a Cys23-Ser23 substitution in the human 5-HT2c receptor gene (HTR2C). Genomics 27:274–279 Lentes KU, Hinney A, Ziegler A, Rosenkranz K, Wurmser H, Barth N, Jacob K, Coners H, Mayer H, Grzeschik KH, Schafer H, Remschmidt H, Pirke KM, Hebebrand J (1997) Evaluation of a Cys23Ser mutation within the human 5-HT2C receptor gene: no evidence for an association of the mutant allele with obesity or underweight in children, adolescents and young adults. Life Sci 61:PL9–16 Lewontin RC (1964) The interaction of selection and linkage. I. General considerations; heterotic models. Genetics 49:49–67 Lindstrom M, Isacsson SO, Merlo J (2003) Increasing prevalence of overweight, obesity and physical inactivity: two population-based studies 1986 and 1994. Eur J Public Health 13:306–312 Meyer J, Saam W, Mossner R, Cangir O, Ortega GR, Tatschner T, Riederer P, Wienker TF, Lesch KP (2002) Evolutionary conserved microsatellites in the promoter region of the 5-hydroxytryptamine receptor 2C gene (HTR2C) are not associated with bipolar disorder in females. J Neural Trans 109:939–946 Miller SA, Dykes DD, Polesky HF (1988) A simple salting out procedure for extracting DNA from human nucleated cells. 
Nucleic Acids Res 16:1215 Nachman MW, Bauer VL, Crowell SL, Aquadro CF (1998) DNA variability and recombination rates at X-linked loci in humans. Genetics 150:1133–1141 Nei M (1987) Molecular evolutionary genetics. Columbia University Press, New York Nickerson DA, Taylor SL, Weiss KM, Clark AG, Hutchinson RG, Stengard J, Salomaa V, Vartiainen E, Boerwinkle E, Sing CF (1998) DNA sequence diversity in a 9.7-kb region of the human lipoprotein lipase gene. Nat Genet 19:233–240 Nonogaki K, Strack AM, Dallman MF, Tecott LH (1998) Leptin-independent hyperphagia and type 2 diabetes in mice with a mutated serotonin 5-HT2C receptor gene. Nat Med 4:1152–1156 Parsian A, Cloninger CR (2001) Serotonergic pathway genes and subtypes of alcoholism: association studies. Psychiatr Genet 11:89–94 Pooley EC, Fairburn CG, Cooper Z, Sodhi MS, Cowen PJ, Harrison PJ (2004) A 5-HT2C receptor promoter polymorphism (HTR2C-759C/T) is associated with obesity in women, and with resistance to weight loss in heterozygotes. Am J Med Genet 126B:124–127 Quested DJ, Whale R, Sharpley AL, McGavin CL, Crossland N, Harrison PJ, Cowen PJ (1999) Allelic variation in the 5-HT2C receptor (HTR2C) and functional responses to the 5-HT2C receptor agonist, m-chlorophenylpiperazine. Psychopharmacology (Berl) 144:306–307 Reynolds GP, Zhang Z, Zhang X (2003) Polymorphism of the promoter region of the serotonin 5-HT(2C) receptor gene and clozapine-induced weight gain. Am J Psychiatry 160:677–679 Rieder MJ, Taylor SL, Clark AG, Nickerson DA (1999) Sequence variation in the human angiotensin converting enzyme. Nat Genet 22:59–62 Rozas J, Gullaud M, Blandin G, Aguade M (2001) DNA variation at the rp49 gene region of Drosophila simulans: evolutionary inferences from an unusual haplotype structure. Genetics 158:1147–1155
Rozas J, Rozas R (1999) DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174–175 Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R (2003) DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19:2496–2497 Sambrook J, Russell DW (2001) Molecular cloning: a laboratory manual, 3rd edn. Cold Spring Harbor Laboratory Press, Cold Spring Harbor Sandelin A, Alkema W, Engstrom P, Wasserman WW, Lenhard B (2004) JASPAR: an open-access database for eukaryotic transcription factor binding profiles. Nucleic Acids Res 32, database issue:D91–94 Sargent PA, Sharpley AL, Williams C, Goodall EM, Cowen PJ (1997) 5-HT2C receptor activation decreases appetite and body weight in obese subjects. Psychopharmacology (Berl) 133:309–312 Schaffhauser AO, Madiehe AM, Braymer HD, Bray GA, York DA (2002) Effects of a high-fat diet and strain on hypothalamic gene expression in rats. Obes Res 10:1188–1196 Slatkin M, Excoffier L (1996) Testing for linkage disequilibrium in genotypic data using the expectation-maximization algorithm. Heredity 76:377–383 Stephens M, Donnelly P (2003) A comparison of Bayesian methods for haplotype reconstruction from population genotype data. Am J Hum Genet 73:1162–1169 Stephens M, Smith NJ, Donnelly P (2001) A new statistical method for haplotype reconstruction from population data. Am J Hum Genet 68:978–989 Strobeck C (1987) Average number of nucleotide differences in a sample from a single subpopulation: a test for population subdivision. Genetics 117:149–153 Taillon-Miller P, Bauer-Sardina I, Saccone NL, Putzel J, Laitinen T, Cao A, Kere J, Pilia G, Rice JP, Kwok PY (2000) Juxtaposed regions of extensive and minimal linkage disequilibrium in human Xq25 and Xq28. Nat Genet 25:324–328 Tajima F (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595
Tecott LH, Sun LM, Akana SF, Strack AM, Lowenstein DH, Dallman MF, Julius D (1995) Eating disorder and epilepsy in mice lacking 5-HT2c serotonin receptors. Nature 374:542–546 Watterson GA (1975) On the number of segregating sites in genetical models without recombination. Theor Popul Biol 7:256–276 Weiss KM, Clark AG (2002) Linkage disequilibrium and the mapping of complex human traits. Trends Genet 18:19–24 Westberg L, Bah J, Rastam M, Gillberg C, Wentz E, Melke J, Hellstrand M, Eriksson E (2002) Association between a polymorphism of the 5-HT2C receptor and weight loss in teenage girls. Neuropsychopharmacology 26:789–793 Vickers SP, Clifton PG, Dourish CT, Tecott LH (1999) Reduced satiating effect of d-fenfluramine in serotonin 5-HT(2C) receptor mutant mice. Psychopharmacology (Berl) 143:309–314 Vickers SP, Dourish CT, Kennett GA (2001) Evidence that hypophagia induced by d-fenfluramine and d-norfenfluramine in the rat is mediated by 5-HT2C receptors. Neuropharmacology 41:200–209 Vincent JB, Masellis M, Lawrence J, Choi V, Gurling HM, Parikh SV, Kennedy JL (1999) Genetic association analysis of serotonin system genes in bipolar affective disorder. Am J Psychiatry 156:136–138 von Meyenburg C, Langhans W, Hrupka BJ (2003) Evidence for a role of the 5-HT2C receptor in central lipopolysaccharide-, interleukin-1 beta-, and leptin-induced anorexia. Pharmacol Biochem Behav 74:1025–1031 Xie E, Zhu L, Zhao L, Chang LS (1996) The human serotonin 5-HT2C receptor: complete cDNA, genomic structure, and alternatively spliced variant. Genomics 35:551–561 Yamada J, Sugimoto Y, Hirose H, Kajiwara Y (2003) Role of serotonergic mechanisms in leptin-induced suppression of milk intake in mice. Neurosci Lett 348:195–197 Yuan X, Yamada K, Ishiyama-Shigemoto S, Koyama W, Nonaka K (2000) Identification of polymorphic loci in the promoter region of the serotonin 5-HT2C receptor gene and their association with obesity and type II diabetes. 
Diabetologia 43:373–376 Zuker M (2003) Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 31:3406–3415