Alu Repeat Diversity in the Human Genome
Renzo Balboa (0), Simon Easteal (0), Hardip Patel (0)
(0) Australian National University
Find me on Wed Nov 25th, 1:30-2:50pm AEDT in Remo, table 102
Abstract
More than half of the human genome consists of repeat sequences. Alu elements are primate-specific ~300 bp repeat sequences considered to be the most successful transposable elements in the human genome. Each individual is believed to carry ~1.1 million copies of Alu repeats, comprising ~11% of the human genome. Their high activity (~1 in 20 births) and high polymorphism rates contribute to human diversity. Alu elements have also been implicated in genome regulation; for example, they contribute up to ~30% of methylation sites in the human genome. Because of their activity, they have also been implicated in genetic instability and disease. A comprehensive understanding of Alu polymorphisms is therefore required to understand their contribution to genome regulation and to health and disease. We have developed an assembly-based approach to comprehensively annotate Alu variation at the individual and population levels, using whole-genome sequence data from 273 individuals from the Simons Genome Diversity Project, encompassing 128 global populations. We find that Alu elements form the largest proportion of annotated structural variation (~15% of all structural variants) in the human genome. We detect ~1.3 million Alu elements across all individuals; ~99% of all detected Alu elements overlap with the reference genome, and ~93% of all reference Alu elements are detected in our analyses. Our approach can confidently describe polymorphisms for ~700 non-reference Alu regions per individual; these elements are largely individual- and population-specific. A greater understanding of Alu polymorphisms in humans will reveal wider patterns of human diversity and pave the way for understanding their roles in DNA regulation and in health and disease.
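The two overlap figures quoted in the abstract (the share of detected Alu elements that overlap the reference, and the share of reference Alu elements that are detected) are reciprocal measures computed over the same two interval sets. The sketch below is only an illustration of that idea, not the authors' pipeline: it assumes elements are represented as half-open `(chrom, start, end)` intervals, counts any shared base as an overlap, and uses made-up toy coordinates. The function names `overlaps` and `fraction_overlapping` are hypothetical.

```python
def overlaps(a, b):
    """True if two half-open intervals (chrom, start, end) share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(queries, targets):
    """Fraction of query intervals that overlap at least one target interval."""
    if not queries:
        return 0.0
    hits = sum(any(overlaps(q, t) for t in targets) for q in queries)
    return hits / len(queries)


# Toy, made-up coordinates (assumption: not real Alu annotations).
reference = [
    ("chr1", 100, 400),
    ("chr1", 1000, 1300),
    ("chr2", 50, 350),
    ("chr3", 10, 310),
]
detected = [
    ("chr1", 120, 410),    # overlaps a reference element
    ("chr2", 60, 340),     # overlaps a reference element
    ("chr2", 5000, 5300),  # non-reference insertion in this toy example
]

# Share of detected elements that overlap the reference set.
detected_in_reference = fraction_overlapping(detected, reference)
# Share of reference elements recovered by the detected set.
reference_recovered = fraction_overlapping(reference, detected)

print(f"detected overlapping reference: {detected_in_reference:.2f}")
print(f"reference elements detected:    {reference_recovered:.2f}")
```

The pairwise comparison is quadratic and only suitable for small examples; on genome-scale annotations a sorted sweep or an interval index would be the usual choice.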