|
Report of the Anthropology / Human Genetic Diversity Component

March 23, 2001

Dear Colleagues,

The IHWG has experienced a few changes and new developments in the last three months, and we are excited to share these developments with you as the Anthro Component begins its second year of activity. New projects have been added to the Workshop, and there are new options for Anthropology Component Participants to consider. The Workshop is now able to accept data generated with SSP and older SSOP systems for incorporation into the Central Database. In addition, we are pleased to provide you with a general overview of the status of our project, as well as an updated timeline for the next fifteen months, which will take us right up to the Workshop and Conference in May of 2002. Finally, the data submission phase of the project has started, and we want to share the guidelines for the submission of demographic and other background data with you.

Anthro Component Project Status

As of March 12, 2001, the Anthro Component is comprised of 63 participating laboratories, each of which is at a different stage in the overall project. Of the 63 labs that have indicated interest in the Project, 15 have passed the QC evaluations and are generating data, and 20 are still carrying out their QC typings. One lab has submitted new data to the database, and we expect more data to come in as we approach the end of March. Of the remaining 28 labs, half are in contact with the Anthropology Team, while the other half have only indicated their interest in participating. However, both of these numbers shrink on a weekly basis as new laboratories order QC cells and QC reagents. If you have not ordered your QC cells and reagents, please take a look at the schedule of deadlines below. The time to complete QC evaluations is growing short.

Virtual DNA Analysis Project

A new project has been added to the IHWG.
The goal of the Virtual DNA Analysis (VDA) project is to develop a software resource that will permit the evaluation of HLA genotypes generated via different typing methods in a single context of current allelic diversity. The VDA project is directed by Wolfgang Helmberg, who has developed the Sequence Compilation and Rearrangement Evaluation (SCORE) software, as described in Helmberg W. et al. (1998) Tissue Antigens, 51, 587.

The program takes raw probe and primer reactivity data and consults a database of sequence-specific probe and primer specificities for different typing systems in order to generate an allele call. Because the program maintains the probe and primer specificities in the context of the current allelic diversity, instead of the level of allele diversity known at the time the typing system was developed, all of the genotypes in the system can be evaluated in a standardized fashion. The SCORE software is also capable of incorporating sequence data.

The Anthropology Component would like to take advantage of the VDA project by accepting non-IHWG typing data that is submitted using SCORE. Previously, non-IHWG data could only be submitted for analysis as Available Data in the form of allele calls, which cannot be effectively re-evaluated in a current context. By using the SCORE software, we will be able to accept pre-existing SSP and SSOP data in the form of probe and primer reactivities, and will be able to evaluate these types alongside the SSOP and RLS data generated using IHWG reagents.

For example, if you submitted genotypes generated using the 12th Workshop SSOP reagents, those genotypes will not include alleles that have been identified subsequent to the 12th Workshop. However, those alleles are included in the allele set for the 13th IHWG reagents. Using the SCORE software, we will be able to re-evaluate the 12th Workshop data using a 13th Workshop allele set.

The SCORE software can also be used to enter 13th IHWG SSOP typing data.
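
The re-evaluation idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SCORE implementation: the function name `call_genotypes`, the locus "X", the allele names, and the probe names are all hypothetical, and each allele is assumed to be fully described by the set of probes it reacts with. The sketch keeps the raw reactivity pattern and interprets it against whichever allele set is supplied.

```python
from itertools import combinations_with_replacement


def call_genotypes(positive_probes, allele_probe_map):
    """Return every allele pair whose combined probe pattern equals the observed one.

    positive_probes: iterable of probe names scored positive for one sample.
    allele_probe_map: dict mapping allele name -> frozenset of probes that
    allele is expected to react with (hypothetical specificity table).
    """
    observed = frozenset(positive_probes)
    calls = []
    # A diploid sample carries two alleles (possibly the same one twice).
    for a, b in combinations_with_replacement(sorted(allele_probe_map), 2):
        expected = allele_probe_map[a] | allele_probe_map[b]
        if expected == observed:
            calls.append((a, b))
    return calls


# Invented allele set standing in for an earlier Workshop.
OLD_ALLELE_SET = {
    "X*01": frozenset({"p1", "p2"}),
    "X*02": frozenset({"p2", "p3"}),
    "X*03": frozenset({"p1", "p4"}),
}

# The same set plus one allele identified later (also invented).
NEW_ALLELE_SET = dict(OLD_ALLELE_SET)
NEW_ALLELE_SET["X*04"] = frozenset({"p1", "p2", "p3"})

# Raw reactivity data for one sample, as it would be stored.
reactivity = {"p1", "p2", "p3"}

old_calls = call_genotypes(reactivity, OLD_ALLELE_SET)
new_calls = call_genotypes(reactivity, NEW_ALLELE_SET)
```

With the older allele set the pattern is consistent with a single allele pair, while the updated set yields additional candidate pairs that involve the newer allele. A stored allele call would hide that ambiguity; stored probe reactivities allow it to be recovered.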
Currently, the only other option for entering 13th IHWG SSOP typing data is to use a set of large, unwieldy Excel spreadsheets, which are difficult for the Central Database to incorporate. Like the pattern interpretation software provided for RLS reagents, the SCORE software automatically formats the data for submission to the IHWG Central Database. All the user has to do is enter the probe reactivities and then email the output file to the database.

The key to the success of the VDA project, and to the benefits the Anthro Component hopes to derive from it, lies in the submission of the sequence motifs detected by the probes and primers of each typing system. Currently, the SCORE software incorporates this information for more than 100 typing systems, developed by a wide variety of laboratories and companies. If your lab has developed its own typing system, you will need to enter the recognized sequence motifs into the program yourself, type the IHWG SSOP Reference Cell Panel using your typing method, and submit the raw typing results using SCORE. However, this only has to be done once, and your data will then become interchangeable with data from other typing systems.

We know that many labs have already generated class II data using SSOP and SSP methods, and we hope that you will decide to submit this data to the Workshop using the SCORE software. Much more information on the Virtual DNA Analysis project is available on the VDA web page of the IHWG web site at http://www.ihwg.org/components/vda/vda.htm.

Biostatistical Analyses

In July of 2001, the Anthropology Component and Biostatistics Core will begin the analysis of genotyping data, which will be presented at the Workshop and Conference in May 2002. For each dataset, these analyses will proceed in four phases, as described below.

Phase One analyses will involve the review and validation of genotyping data. Validation will be carried out by testing for departure from Hardy-Weinberg proportions (HWP) and by estimating haplotype frequencies (HF).
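
As an illustration of what a test for departure from HWP involves, the following Python sketch computes a chi-square goodness-of-fit statistic for one locus. It is a minimal example under stated assumptions and is not the Biostatistics Core's procedure: the function name `hwp_chi_square` and the allele labels are hypothetical, and the chi-square approximation becomes unreliable when expected genotype counts are small, which is common at highly polymorphic loci, so an exact test may be preferable in practice.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hwp_chi_square(genotypes):
    """Chi-square statistic and degrees of freedom for departure from HWP.

    genotypes: list of (allele, allele) tuples, one per individual.
    Expected counts are n * p_i^2 for homozygotes and n * 2 * p_i * p_j
    for heterozygotes, using allele frequencies estimated from the sample.
    """
    n = len(genotypes)
    allele_counts = Counter(allele for g in genotypes for allele in g)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}
    # Sort each genotype so ("A2", "A1") and ("A1", "A2") are counted together.
    observed = Counter(tuple(sorted(g)) for g in genotypes)

    chi2 = 0.0
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        p = freqs[a] * freqs[b]
        expected = n * (p if a == b else 2 * p)
        chi2 += (observed[(a, b)] - expected) ** 2 / expected

    k = len(freqs)
    # Genotype classes minus one, minus (k - 1) estimated allele frequencies.
    df = k * (k - 1) // 2
    return chi2, df


# Hypothetical samples with two alleles at one locus.
in_hwp = [("A1", "A1")] * 25 + [("A1", "A2")] * 50 + [("A2", "A2")] * 25
no_heterozygotes = [("A1", "A1")] * 50 + [("A2", "A2")] * 50

stat_in_hwp, df_in_hwp = hwp_chi_square(in_hwp)
stat_skewed, df_skewed = hwp_chi_square(no_heterozygotes)
```

The first sample matches the expected proportions exactly and gives a statistic of zero. The second has the same allele frequencies but no heterozygotes and gives a large statistic, the kind of result that would prompt a review of the typing and data entry.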
In cases where the population (a) deviates from HWP or (b) carries haplotypes that are unexpected for that geographic region, the data will be reviewed and the submitting labs contacted. This analysis does not aim to eliminate any data. The goal is simply to determine whether the HWP deviation or the unexpected HF results can be explained by typing or input error before proceeding to the next phases of the analyses. These analyses will use the raw genotype data as well as demographic data for the samples and populations (see below). It will be imperative to provide this demographic and background data in order for these first-level analyses to be completed.

Phase Two analyses will be statistical and population genetic tests of population variation. These will include measures of genetic diversity and levels of variation, conformity to expected Hardy-Weinberg equilibrium proportions, tests of neutrality of allele and haplotype frequency distributions, and levels of linkage disequilibrium.

Phase Three analyses will consist of meta-analyses of the results obtained in the Phase Two analyses. These meta-analyses will quantify results across the multiple populations and loci surveyed, searching for patterns of variation shared across populations and loci.

Phase Four analyses will consist of interpretations of the results obtained in the second and third phases of analysis. We will use the results from individual populations and the comparisons between them to address questions of anthropological and evolutionary interest.

Anthro/Biostatistics 2001/2002 Timetable

After discussion with the Biostatistics Core, we have developed an updated deadline schedule for the next fifteen months. The goal of this schedule is to facilitate the greatest degree of data analysis in the time remaining before the Workshop and Conference.

The deadline for a complete analysis, involving all four phases of data consideration, is September 30, 2001.
If this deadline cannot be met, data will still be analyzed as in the first and second phases, but we will not be able to guarantee that the data will be included in the meta-analyses and subsequent inferences.

The deadline for a partial analysis, involving at least the first and second phases of data consideration, is December 31, 2001. All data will eventually undergo complete analysis, as will all data submitted in 2002, but these additional analyses will not begin until after the Workshop and Conference in May.

We strongly encourage you to submit your data between now and July 2001, and no later than September 30. This will allow us to include it in the complete analysis.

As the data submission deadlines draw near, we will need to focus on data validation analysis rather than QC evaluations. As a result, the QC evaluation process will close on July 1, 2001. RLS QC reagents will not be sent out after that time. Please submit your QC typings before July 1.

If you intend to participate in the Workshop by generating data, please order your QC cell panel cells and QC typing reagents in the next month. If you have data generated using an SSP system or an older SSOP system, please request the SCORE software and begin the process of registering your typing system as soon as possible.

RLS HLA-C QC Evaluations

After reviewing the HLA-C locus QC data submitted by several labs, we have concluded that probe 14 on the HLA-C strip has been the source of much of the difficulty experienced in carrying out HLA-C QC typings. Because probe 14 is indicated as a consistently faint probe on the Faint Allele table for HLA-C, and because it has been consistently miscalled as a false negative during QC evaluations, we have decided to exclude probe 14 from consideration during the evaluation of QC results at the HLA-C locus.

Please note that probe 14 will be included during the data generation phase of the Workshop.
Therefore, it is imperative that you carefully consider this probe when entering your data using the pattern interpretation software. ANY signal at probe 14, no matter how faint, should be entered as positive during future Workshop typing. We will include a photograph on the Anthropology web pages to illustrate the 'faint' nature of this probe, and hope that this will help during future data interpretation.

Please remember that the cells of the QC cell panel are unrelated, and that as such, the QC cell panel is a useful model for practicing probe calls as well as for typing wide variations in allelic combinations. However, when typing populations, the relatedness between population samples will likely provide contextual consistencies that will make it easier to correctly call potentially ambiguous probes.

Even as the Anthro Component's QC phase comes to a close, we will continue to provide troubleshooting assistance based on the typing experiences of the participating labs. In addition, we welcome and encourage any questions you may have concerning probe ambiguities, in both the QC and the population genotyping phases of this project. We want each participating Anthropology Component lab to have the greatest success possible when it comes to generating accurate data, and we appreciate whatever feedback you can provide.

Entry of Demographic and Background Data for Populations and Samples

Without accurate background data (including data describing demographic information, ethnicity, collection methods, etc.) on populations and samples, the Biostatistics Core will be unable to carry out useful analyses of the data. The genotyping data has an important counterpart in the demographic and background data for each population and sample.

We are working closely with the Central Database to develop Excel spreadsheets and documentation, and we anticipate that these materials will be ready within the next two months.
These data will be submitted by entering them into an Excel spreadsheet, as outlined in the documentation, and then emailing them to the Central Database, where they will be coordinated with the typing data. In anticipation of the availability of these data submission materials, we suggest that you review whatever background information you have, as well as any other information on the collection and nature of the samples that you have available. When the data submission forms and documentation are available for use, they will appear on the IHWG web site, on the Central Database pages.

Updates to Available Data submission forms

The data form for the submission of Available Data (genotypes generated using non-IHWG reagents) available on the IHWG web site has been updated in three ways. Fields for the HLA-DRA and HLA-DPA1 loci have been added, and the order of the locus fields has been changed to reflect the order of the loci on Chromosome 6. The fields are now in the order A, C, B, DRA, DRB1, DQA1, DQB1, DPA1, DPB1.

We hope that most laboratories will choose to submit Available Data via the SCORE program. Data submitted using this program will be of greater use to the HLA community in the long run, but if you have typing data and cannot submit it using SCORE, please use the Available Data form on the IHWG web site at http://www.ihwg.org/shared/database.htm#Anthropology.

Human Subjects Use Authorizations

Please remember that data cannot be accepted into the Central Database if you have not submitted a Letter of Certification indicating that you have Authorization for Human Subjects Research. A copy of this letter can be found at either http://www.ihwg.org/components/IRBLetter.pdf,

Anthro Component Meeting at EFI

At the end of this month, the various IHWG projects and components will be meeting over the course of the last two days of the annual EFI meeting in Granada, Spain.
The purpose of these meetings is to share preparations and plans for the Workshop and Conference in 2002, as well as to discuss project aims, progress, preliminary data and technical issues directly with project leaders.

The Anthropology / Human Diversity Component will be meeting from 11:30 to 13:30 on Thursday, March 29th in Seminarios 3 and 4, Level 1. We will discuss the current aims and scope of the Anthro Component, as well as possible additions to the project. In addition, we will present some preliminary analyses of the data that have been submitted, as well as the results of the QC typings using the Reverse Line Strip system, and will demonstrate the pattern interpretation software distributed for use with the strips.

We are looking forward to seeing you in Granada!

Henry Erlich, Steve Mack, and Laura Geyer

Anthropology / Human Genetic Diversity Component
Henry A. Erlich, Ph.D., Chair and Steven J. Mack, Ph.D.
ROCHE MOLECULAR SYSTEMS
CHILDREN'S HOSPITAL OAKLAND RESEARCH INSTITUTE
|