Detection of Mosaic Sequence Variants Associated with Human Genetic Diseases
Department of Genomic Medicine1, Seoul National University Hospital, Seoul; Department of Laboratory Medicine2, Seoul National University Hospital, Seoul National University College of Medicine, Seoul; Cancer Research Institute3, Seoul National University College of Medicine, Seoul, Korea
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Lab Med Online 2023; 13(4): 282-289
Published October 1, 2023
Copyright © The Korean Society for Laboratory Medicine.
Mosaicism describes the coexistence of genetically distinct normal and abnormal cells in a single individual who originated from a single fertilized egg. For example, germline mosaicism refers to the presence of a genetic variant in only a subset of an organism's germline cells, which are the cells that give rise to eggs or sperm. Hence, some offspring of an individual may inherit a different genetic variant than others, and the genetic diversity among the offspring can be greater than that among the somatic cells of the individual. Somatic mosaicism, in contrast, refers to the presence of genetic variants in a subset of the cells of the body but not in the germline (Fig. 1).
Figure 1. Distribution of mutant cells in the human body and different types of mosaicism in particular individuals. In somatic or germline mosaicism, mutant cells may appear with different mosaicism ratios in distinct tissues of patients, including gonads. (A) Example of somatic mosaicism confined to endoderm among the three germinal cell layers. (B) In focal cortical dysplasia, somatic mosaicism appears localized to a specific organ, such as the brain. (C) Germline or gonadal mosaicism refers to genetic variation in the genomes of germline cells within an individual. (D) Somatic and germline mosaicism may coexist in the same individual.
Replicating human DNA, which comprises approximately 3 billion base pairs per haploid genome, is a complex process that is susceptible to errors despite the various proofreading mechanisms in cells. The process of differentiation, which involves rapid DNA replication, is especially prone to mutation. When a mutation arises during differentiation following fertilization, the result is somatic mosaicism, which affects specific tissues or cells rather than the entire organism. This localized mosaicism can result in genetic disorders with symptoms manifesting in specific tissues or organs.
Since the advent of high-throughput next-generation sequencing (NGS) in 2005, genome analysis has progressed significantly, enabling the detection of mosaicism and improving our understanding of mosaic disorders. However, even healthy donors can exhibit mosaic variation, with mutant allele fractions ranging from 1.0% to 29.7% within organ samples. Clinical laboratories therefore increasingly need effective strategies for detecting clinically relevant mosaicism. This review provides an overview of the most effective methods for detecting clinically relevant mosaic sequence variants, focusing on target diseases, analytical techniques, and sample types.
Mosaic diseases are a group of rare disorders characterized by abnormalities that often result from mutations in critical signaling pathways. These mutations would be fatal if present in all cells. However, when limited to a subset of cells, they can cause partial overgrowth, brain malformations, skin symptoms, and vascular malformations [6, 7]. While each clinical entity is uncommon, collectively they form a distinct group of disorders. Mendelian disorders can also exhibit mosaicism: approximately 1.5% of patients with various diseases who underwent exome sequencing demonstrated mosaicism. Somatic mosaicism has been extensively examined in neurodevelopmental diseases, including autism, developmental delay, intellectual disability, and epilepsy [9-11]. Exome sequencing studies suggest that approximately 3–5% of autism cases [12-14] and 1% of developmental disability cases are caused by somatic mosaicism. Therefore, clinical laboratories should consider the possibility of mosaicism in these representative diseases (Table 1).
The mosaic form of neurofibromatosis type 1 (NF1), a type of hereditary cancer, has been attributed to related changes in the
A recent review investigated the prevalence and types of mosaic sequence variants in patients with tuberous sclerosis complex (TSC). A total of 170 tissue samples from 39 patients with TSC carrying mosaic variants were analyzed by massively parallel sequencing. Patients with the mosaic form of TSC (variant allele frequency [VAF]: 0–10%, median 1.7% in blood DNA) showed a milder and distinct clinical phenotype compared with patients in other TSC studies, with similar occurrences of facial angiofibromas (92%) and kidney angiomyolipomas (83%), but fewer seizures, cortical tubers, and multiple other manifestations (
Identifying genetic mosaicism is important for establishing a diagnosis, evaluating the risk of recurrence, and delivering precise genetic counseling. For prenatal counseling in particular, it is important to determine whether the corresponding mosaicism is present in the gonadal tissue.
DETECTION OF MOSAIC SEQUENCE VARIANTS
1. Applied methods for detecting mosaic sequence variants: Wet experiment
Studies investigating mosaicism in diseases with a relatively high incidence of such genetic changes, such as those in
Until recently, studies of somatic mosaicism were limited to specific genes for particular disorders. With the advent of NGS, it is now possible to analyze somatic mosaicism in tens to hundreds of genes simultaneously, or even at the genome level. Nevertheless, conventional NGS analysis typically proceeds at a depth of approximately 100X–200X, which can identify mosaic cells only when their ratio to normal cells exceeds 10%. Below this threshold, true variants become difficult to distinguish from analytical errors.
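The relationship between depth and detectable mosaic fraction can be illustrated with a simple binomial model. The sketch below is an assumption-laden illustration, not a validated clinical calculation: it assumes a heterozygous variant in a diploid region (so the expected VAF is half the mutant cell fraction), independent reads, a hypothetical calling threshold of five alternate reads, and a hypothetical per-base error rate of 0.1%. The function names are illustrative only.

```python
from math import comb


def vaf_from_cell_fraction(mutant_cell_fraction: float) -> float:
    """Expected VAF for a heterozygous variant in a diploid region (assumption)."""
    return mutant_cell_fraction / 2.0


def prob_at_least_k_alt_reads(depth: int, vaf: float, min_alt_reads: int) -> float:
    """P(X >= min_alt_reads) for X ~ Binomial(depth, vaf)."""
    p_fewer = sum(
        comb(depth, i) * vaf**i * (1.0 - vaf) ** (depth - i)
        for i in range(min_alt_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    MIN_ALT_READS = 5   # hypothetical calling threshold
    ERROR_RATE = 0.001  # hypothetical per-base error rate toward the same allele
    for depth in (100, 200, 1000):
        # Chance that sequencing errors alone reach the calling threshold.
        p_error = prob_at_least_k_alt_reads(depth, ERROR_RATE, MIN_ALT_READS)
        for cell_fraction in (0.02, 0.10, 0.20):
            vaf = vaf_from_cell_fraction(cell_fraction)
            p_true = prob_at_least_k_alt_reads(depth, vaf, MIN_ALT_READS)
            print(
                f"depth={depth:5d} cells={cell_fraction:.2f} vaf={vaf:.3f} "
                f"P(detect)={p_true:.3f} P(error only)={p_error:.2e}"
            )
```

Under this model, raising the depth increases the chance of sampling enough alternate reads from a low-fraction variant, but it also increases the chance that errors alone reach a fixed read-count threshold, which is why depth by itself does not resolve low-level mosaicism.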
While conventional NGS analysis has provided valuable information about cancer genetics, detecting rare variants with a low allele fraction is challenging because sequencing and PCR introduce erroneous nucleotide changes. Molecular barcodes can help filter out such errors and thereby eliminate false positives. By comparing read replicates that carry identical barcodes, genuine rare variants can be detected and errors present in only some replicates can be removed.
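The idea of comparing read replicates that share a barcode can be sketched as a per-family majority vote. This is a minimal illustration for a single genomic position, not the algorithm of any specific kit or pipeline; the input format, the function name `call_consensus`, and the thresholds `min_family_size` and `min_agreement` are all assumptions chosen for the example.

```python
from collections import Counter, defaultdict


def call_consensus(reads, min_family_size=3, min_agreement=0.7):
    """Collapse reads sharing a barcode into one consensus base per family.

    reads: iterable of (barcode, base) pairs observed at one position.
    Families that are too small or too discordant are discarded.
    """
    families = defaultdict(list)
    for barcode, base in reads:
        families[barcode].append(base)

    consensus = {}
    for barcode, bases in families.items():
        if len(bases) < min_family_size:
            continue  # too few replicates to vote out an error
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            consensus[barcode] = base
    return consensus


if __name__ == "__main__":
    # Hypothetical reads: reference base C, candidate variant base T.
    reads = (
        [("AAC", "T")] * 3                    # family supporting the variant
        + [("GGT", "C")] * 3 + [("GGT", "T")]  # one erroneous T is voted out
        + [("TTA", "C")] * 3                  # reference family
        + [("CCG", "T")]                      # singleton, discarded
    )
    result = call_consensus(reads)
    variant_families = sum(1 for base in result.values() if base == "T")
    print(result)
    print(f"{variant_families} of {len(result)} consensus molecules carry T")
```

Counting consensus molecules rather than raw reads means that an error arising in one replicate does not contribute to the variant count, at the cost of discarding families with too few reads.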
The safe sequencing system (SafeSeqS) method is a well-known example of tag-based error correction in NGS and uses PCR primers with degenerate sequence tails to attach tags. After a few PCR cycles with the tagged primers, a second round with universal primers is used for further amplification, creating multiple copies of each founding molecule. These copies are then grouped into families for consensus-based error correction. Although the random tags pose challenges for PCR multiplexing and for sequencing large numbers of targets simultaneously, newer variations of the method incorporating hairpin designs have partially alleviated this issue. Alternatively, single-molecule molecular inversion probes (smMIPs) can tag targets without the risk of double-tagging the same molecule. This method involves ligating a single oligonucleotide with two targeting arms and a molecular barcode to form tagged, closed-loop products, which can be enriched, amplified, and sequenced. Although designing smMIPs can be challenging due to constraints on the proximity window for the targeting arms, advances in software algorithms have made this technique more feasible.
Each single strand can be uniquely marked by ligating adapters tailed with unique molecular identifiers (UMIs) to a library. However, this method lacks the means to relate the consensus of one strand to that of its mate for comparison, which can cause early PCR errors to go unnoticed. Researchers therefore used a newly developed dual-molecular barcode technology called Ion AmpliSeq HD to analyze somatic mutations in 24 samples from 12 patients with biliary-pancreatic and non-small cell lung cancers, including cell-free DNA in plasma. The dual-molecular barcode sequencing technology enabled the detection of variants with low VAF values (as low as 0.17%).
Nevertheless, these barcoding techniques have the drawback of generating redundant reads of non-target sequences, which increases sequencing costs. To address this issue, researchers recently proposed a barcode-free NGS error validation method. Because the DNA clones of erroneous reads are physically extracted in this method, true variants with a frequency above 0.003% can be detected.
Researchers used a combination of amplicon-based NGS, droplet digital PCR, and blocker displacement amplification to validate 102 candidate mosaic variants. Of these variants, 27 (26.4%) were confirmed to have low (VAF between 1% and 10%) or very low (VAF <1%) levels of mosaicism. The authors reported that their computational pipeline can accurately distinguish true from false-positive mosaic variants and efficiently detect low-level mosaicism in exome sequencing samples. In addition, they found that the presence of two or more alternate reads in the parental sample is a reliable indicator of low-level somatic mosaicism. Table 2 summarizes the aforementioned methods.
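The VAF tiers and the alternate-read indicator described above can be written down as a small helper. This is only a sketch of the thresholds quoted in the text (1% and 10% VAF, two or more alternate reads); the function names and the label for VAF values above 10% are hypothetical and are not taken from the cited pipeline.

```python
def classify_mosaic_level(alt_reads: int, depth: int) -> str:
    """Assign a VAF tier using the 1% and 10% cut-offs quoted in the text."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    if alt_reads == 0:
        return "not detected"
    vaf = alt_reads / depth
    if vaf < 0.01:
        return "very low"
    if vaf <= 0.10:
        return "low"
    return "above low-level range"  # hypothetical label for VAF > 10%


def has_parental_read_support(parental_alt_reads: int, min_alt_reads: int = 2) -> bool:
    """Flag a parental sample with two or more alternate reads as a candidate."""
    return parental_alt_reads >= min_alt_reads


if __name__ == "__main__":
    # Hypothetical (alternate reads, depth) pairs from a parental exome.
    for alt, depth in [(0, 150), (2, 400), (5, 100), (30, 100)]:
        print(alt, depth, classify_mosaic_level(alt, depth), has_parental_read_support(alt))
```

A candidate flagged this way would still need orthogonal confirmation, as in the validation study summarized above.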
2. Applied methods for detecting mosaic sequence variants: Bioinformatic tools
When conducting mosaicism analysis with NGS data in a non-clinical laboratory setting, the following tools can be considered. Popular somatic mutation detection tools such as MuTect2 (https://gatk.broadinstitute.org/hc/en-us/articles/360037593851-Mutect2) and VarScan2 (http://dkoboldt.github.io/varscan/) can be applied.
Detecting non-cancer mosaic variants is difficult due to the limited number of non-clonal mosaic variants. In a recent study, a new tool called DeepMosaic (https://github.com/Virginiaxu/DeepMosaic) was introduced, which combines an image-based visualization module for single-nucleotide mosaic variants with a convolutional neural network-based classification module for control-independent mosaic variant detection. DeepMosaic was trained using 180,000 simulated or experimentally assessed mosaic variants and was evaluated using 619,740 simulated mosaic variants and 530 biologically tested mosaic variants from 16 genomes and 181 exomes. DeepMosaic outperformed the existing methods, with a sensitivity of 0.78, a specificity of 0.83, and a positive predictive value of 0.96 on non-cancer whole-genome sequencing data. Additionally, DeepMosaic doubled the validation rate over previous best-practice methods on non-cancer whole-exome sequencing data (0.43 versus 0.18). These findings indicate that DeepMosaic is a reliable and accurate classifier for non-cancer mosaic variants and can be used in combination with or as an alternative to existing methods.
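For readers comparing variant callers, the three performance measures quoted above follow from the standard confusion-matrix definitions. The sketch below uses hypothetical counts chosen only to make the arithmetic easy to follow; they are not the DeepMosaic evaluation data, and the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # true mosaic variants called as mosaic
    fp: int  # non-mosaic sites called as mosaic
    tn: int  # non-mosaic sites correctly rejected
    fn: int  # true mosaic variants missed


def sensitivity(c: ConfusionCounts) -> float:
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    return c.tn / (c.tn + c.fp)


def positive_predictive_value(c: ConfusionCounts) -> float:
    return c.tp / (c.tp + c.fp)


if __name__ == "__main__":
    # Hypothetical benchmark of 100 true variants and 100 non-variant sites.
    counts = ConfusionCounts(tp=78, fp=17, tn=83, fn=22)
    print(f"sensitivity = {sensitivity(counts):.2f}")
    print(f"specificity = {specificity(counts):.2f}")
    print(f"PPV         = {positive_predictive_value(counts):.2f}")
```

Unlike sensitivity and specificity, the positive predictive value depends on the proportion of true variants among the evaluated candidates, so it cannot be derived from the other two measures alone.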
The MosaicForecast (https://github.com/parklab/MosaicForecast/) technique uses machine learning and read-based phasing to identify mosaic single-nucleotide variants (SNVs) and indels accurately, and it outperforms current algorithms severalfold in terms of specificity. MosaicForecast achieves this by substantially enhancing the detection of mosaic SNVs and indels from sequencing data without matched controls. Incorporating various read-level features into a nonlinear classifier is essential for distinguishing real mosaic mutations from germline variants and artifacts, which arise mainly from CNV/repeat regions. In addition, MosaicForecast uses so-called phasable sites to construct a highly reliable training set of mosaic variants.
3. Examined specimens influence the mosaicism detection rate
Establishing a method for detecting mosaicism requires careful selection of a suitable sample. One of the major challenges in detecting mosaicism is identifying a tissue sample that harbors the mutation of interest. In the majority of human genetic research and diagnostic studies, peripheral blood DNA is the preferred source of genetic material due to its accessibility and ease of isolation. However, blood cells are not a stable source of genetic material because they undergo multiple rounds of self-renewal during hematopoiesis, which can reduce the diversity of clonal lineages with age. To overcome this challenge, other sources of DNA can be used, and multiple sources can be examined. For example, ectodermal tissues can be obtained from buccal brushings or hair root bulbs, mesodermal tissues from blood or saliva, and DNA of endodermal origin from urothelial cells collected in urine samples. For males, sperm samples may be particularly informative for recurrence risk assessment. In addition, the levels of mosaicism can vary across tissues and body locations, even within the same embryonic lineage, due to mutation timing, cell migration and determination during development, and tissue-specific cell-autonomous selective effects.
CONCLUSION AND FUTURE PERSPECTIVES
Somatic mosaicism is a genetic phenomenon that can lead to various diseases. Therefore, clinical laboratories must establish appropriate specimens and methods for accurate diagnosis. With the increasing availability of advanced diagnostic techniques, the demand to develop effective diagnostic strategies to identify somatic mosaic mutations in various specimens is increasing. Additionally, the diagnosis of somatic mosaic diseases requires a deep understanding of the underlying mechanisms that generate mosaic mutations and the corresponding pathophysiology of the disease. Finally, laboratories must maintain high quality control standards to ensure accurate and reliable results. Overall, it is important for clinical laboratories to establish a comprehensive diagnostic strategy suitable for the specific somatic mosaic disease in question and to employ appropriate techniques and specimens to ensure a successful diagnosis.
- Thorpe J, Osei-Owusu IA, Avigdor BE, Tupler R, Pevsner J. Mosaicism in human health and disease. Annu Rev Genet 2020;54:487-510.
- Freed D, Stevens EL, Pevsner J. Somatic mosaicism in the human genome. Genes (Basel) 2014;5:1064-94.
- Campbell IM, Shaw CA, Stankiewicz P, Lupski JR. Somatic mosaicism: implications for disease and transmission genetics. Trends Genet 2015;31:382-92.
- Huang AY, Xu X, Ye AY, Wu Q, Yan L, Zhao B, et al. Postzygotic single-nucleotide mosaicisms in whole-genome sequences of clinically unremarkable individuals. Cell Res 2014;24:1311-27.
- Lim YH, Moscato Z, Choate KA. Mosaicism in cutaneous disorders. Annu Rev Genet 2017;51:123-41.
- D'Gama AM, Walsh CA. Somatic mosaicism and neurodevelopmental disease. Nat Neurosci 2018;21:1504-14.
- Kurek KC, Luks VL, Ayturk UM, Alomari AI, Fishman SJ, Spencer SA, et al. Somatic mosaic activating mutations in PIK3CA cause CLOVES syndrome. Am J Hum Genet 2012;90:1108-15.
- Cao Y, Tokita MJ, Chen ES, Ghosh R, Chen T, Feng Y, et al. A clinical survey of mosaic single nucleotide variants in disease-causing genes detected by exome sequencing. Genome Med 2019;11:48.
- D'Gama AM, Woodworth MB, Hossain AA, Bizzotto S, Hatem NE, LaCoursiere CM, et al. Somatic mutations activating the mTOR pathway in dorsal telencephalic progenitors cause a continuum of cortical dysplasias. Cell Rep 2017;21:3754-66.
- Lim JS, Gopalappa R, Kim SH, Ramakrishna S, Lee M, Kim WI, et al. Somatic mutations in TSC1 and TSC2 cause focal cortical dysplasia. Am J Hum Genet 2017;100:454-72.
- Perez D, Hsieh DT, Rohena L. Somatic mosaicism of PCDH19 in a male with early infantile epileptic encephalopathy and review of the literature. Am J Med Genet A 2017;173:1625-30.
- Freed D, Pevsner J. The contribution of mosaic variants to autism spectrum disorder. PLoS Genet 2016;12:e1006245.
- Krupp DR, Barnard RA, Duffourd Y, Evans SA, Mulqueen RM, Bernier R, et al. Exonic mosaic mutations contribute risk for autism spectrum disorder. Am J Hum Genet 2017;101:369-90.
- Dou Y, Yang X, Li Z, Wang S, Zhang Z, Ye AY, et al. Postzygotic single-nucleotide mosaicisms contribute to the etiology of autism spectrum disorder and autistic traits and the origin of mutations. Hum Mutat 2017;38:1002-13.
- Wright CF, Prigmore E, Rajan D, Handsaker J, McRae J, Kaplanis J, et al. Clinically-relevant postzygotic mosaicism in parents and children with developmental disorders in trio exome sequencing data. Nat Commun 2019;10:2985.
- García-Romero MT, Parkin P, Lara-Corrales I. Mosaic neurofibromatosis type 1: a systematic review. Pediatr Dermatol 2016;33:9-17.
- Giannikou K, Lasseter KD, Grevelink JM, Tyburczy ME, Dies KA, Zhu Z, et al. Low-level mosaicism in tuberous sclerosis complex: prevalence, clinical features, and risk of disease transmission. Genet Med 2019;21:2639-43.
- Cook CB, Armstrong L, Boerkoel CF, Clarke LA, du Souich C, Demos MK, et al. Somatic mosaicism detected by genome-wide sequencing in 500 parent-child trios with suspected genetic disease: clinical and genetic counseling implications. Cold Spring Harb Mol Case Stud 2021;7:a006125.
- Aretz S, Stienen D, Friedrichs N, Stemmler S, Uhlhaas S, Rahner N, et al. Somatic APC mosaicism: a frequent cause of familial adenomatous polyposis (FAP). Hum Mutat 2007;28:985-92.
- Hes FJ, Nielsen M, Bik EC, Konvalinka D, Wijnen JT, Bakker E, et al. Somatic APC mosaicism: an underestimated cause of polyposis coli. Gut 2008;57:71-6.
- Petrackova A, Vasinek M, Sedlarikova L, Dyskova T, Schneiderova P, Novosad T, et al. Standardization of sequencing coverage depth in NGS: recommendation for detection of clonal and subclonal mutations in cancer diagnostics. Front Oncol 2019;9:851.
- Kinde I, Wu J, Papadopoulos N, Kinzler KW, Vogelstein B. Detection and quantification of rare mutations with massively parallel sequencing. Proc Natl Acad Sci U S A 2011;108:9530-5.
- Hiatt JB, Pritchard CC, Salipante SJ, O'Roak BJ, Shendure J. Single molecule molecular inversion probes for targeted, high-accuracy detection of low-frequency variation. Genome Res 2013;23:843-54.
- Boyle EA, O'Roak BJ, Martin BK, Kumar A, Shendure J. MIPgen: optimized modeling and design of molecular inversion probes for targeted resequencing. Bioinformatics 2014;30:2670-2.
- MacConaill LE, Burns RT, Nag A, Coleman HA, Slevin MK, Giorda K, et al. Unique, dual-indexed sequencing adapters with UMIs effectively eliminate index cross-talk and significantly improve sensitivity of massively parallel sequencing. BMC Genomics 2018;19:30.
- Hirotsu Y, Otake S, Ohyama H, Amemiya K, Higuchi R, Oyama T, et al. Dual-molecular barcode sequencing detects rare variants in tumor and cell free DNA in plasma. Sci Rep 2020;10:3391.
- Yeom H, Lee Y, Ryu T, Noh J, Lee AC, Lee HB, et al. Barcode-free next-generation sequencing error validation for ultra-rare variant detection. Nat Commun 2019;10:977.
- Gambin T, Liu Q, Karolak JA, Grochowski CM, Xie NG, Wu LR, et al. Low-level parental somatic mosaic SNVs in exomes from a large cohort of trios with diverse suspected Mendelian conditions. Genet Med 2020;22:1768-76.
- Cibulskis K, Lawrence MS, Carter SL, Sivachenko A, Jaffe D, Sougnez C, et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnol 2013;31:213-9.
- Koboldt DC, Zhang Q, Larson DE, Shen D, McLellan MD, Lin L, et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res 2012;22:568-76.
- Yang X, Xu X, Breuss MW, Antaki D, Ball LL, Chung C, et al. Control-independent mosaic single nucleotide variant detection with DeepMosaic. Nat Biotechnol 2023;41:870-7.
- Dou Y, Kwon M, Rodin RE, Cortés-Ciriano I, Doan R, Luquette LJ, et al. Accurate detection of mosaic variants in sequencing data without matched controls. Nat Biotechnol 2020;38:314-9.
- Goriely A, Lord H, Lim J, Johnson D, Lester T, Firth HV, et al. Germline and somatic mosaicism for FGFR2 mutation in the mother of a child with Crouzon syndrome: implications for genetic testing in "paternal age-effect" syndromes. Am J Med Genet A 2010;152A:2067-73.
- Hyland VJ, Robertson SP, Flanagan S, Savarirayan R, Roscioli T, Masel J, et al. Somatic and germline mosaicism for a R248C missense mutation in FGFR3, resulting in a skeletal dysplasia distinct from thanatophoric dysplasia. Am J Med Genet A 2003;120A:157-68.
- Taylor SA, Deugau KV, Lillicrap DP. Somatic mosaicism and female-to-female transmission in a kindred with hemophilia B (factor IX deficiency). Proc Natl Acad Sci U S A 1991;88:39-42.
- Costa JM, Vidaud D, Laurendeau I, Vidaud M, Fressinaud E, Moisan JP, et al. Somatic mosaicism and compound heterozygosity in female hemophilia B. Blood 2000;96:1585-7.
- Leuer M, Oldenburg J, Lavergne JM, Ludwig M, Fregin A, Eigel A, et al. Somatic mosaicism in hemophilia A: a fairly common event. Am J Hum Genet 2001;69:75-87.
- Lemmers RJ, van der Wielen MJ, Bakker E, Padberg GW, Frants RR, van der Maarel SM. Somatic mosaicism in FSHD often goes undetected. Ann Neurol 2004;55:845-50.
- Buzhov BT, Lemmers RJ, Tournev I, van der Wielen MJ, Ishpekova B, Petkov R, et al. Recurrent somatic mosaicism for D4Z4 contractions in a family with facioscapulohumeral muscular dystrophy. Neuromuscul Disord 2005;15:471-5.
- Lim JS, Kim WI, Kang HC, Kim SH, Park AH, Park EK, et al. Brain somatic mutations in MTOR cause focal cortical dysplasia type II leading to intractable epilepsy. Nat Med 2015;21:395-400.
- Møller RS, Weckhuysen S, Chipaux M, Marsan E, Taly V, Bebin EM, et al. Germline and somatic mutations in the MTOR gene in focal cortical dysplasia and epilepsy. Neurol Genet 2016;2:e118.
- Lalonde E, Ebrahimzadeh J, Rafferty K, Richards-Yutz J, Grant R, Toorens E, et al. Molecular diagnosis of somatic overgrowth conditions: a single-center experience. Mol Genet Genomic Med 2019;7:e536.
- Lindhurst MJ, Sapp JC, Teer JK, Johnston JJ, Finn EM, Peters K, et al. A mosaic activating mutation in AKT1 associated with the Proteus syndrome. N Engl J Med 2011;365:611-9.
- Wieland I, Tinschert S, Zenker M. High-level somatic mosaicism of AKT1 c.49G>A mutation in skin scrapings from epidermal nevi enables non-invasive molecular diagnosis in patients with Proteus syndrome. Am J Med Genet A 2013;161A:889-91.
- Hildebrand MS, Harvey AS, Malone S, Damiano JA, Do H, Ye Z, et al. Somatic GNAQ mutation in the forme fruste of Sturge-Weber syndrome. Neurol Genet 2018;4:e236.
- Tyburczy ME, Dies KA, Glass J, Camposano S, Chekaluk Y, Thorner AR, et al. Mosaic and intronic mutations in TSC1/TSC2 explain the majority of TSC patients with no mutation identified by conventional testing. PLoS Genet 2015;11:e1005637.