  • Article
  • Published:

The genomes of all lungfish inform on genome expansion and tetrapod evolution

Abstract

The genomes of living lungfishes can inform on the molecular-developmental basis of the Devonian sarcopterygian fish–tetrapod transition. We de novo sequenced the genomes of the African (Protopterus annectens) and South American lungfishes (Lepidosiren paradoxa). The Lepidosiren genome (about 91 Gb, roughly 30 times the human genome) is the largest animal genome sequenced so far and more than twice the size of the genomes of the Australian (Neoceratodus forsteri)1 and African2 lungfishes owing to enlarged intergenic regions and introns with high repeat content (about 90%). All lungfish genomes continue to expand as some transposable elements (TEs) are still active today. In particular, Lepidosiren’s genome grew extremely fast during the past 100 million years (Myr), adding the equivalent of one human genome every 10 Myr. This massive genome expansion seems to be related to a reduction of PIWI-interacting RNAs and C2H2 zinc-finger and Krüppel-associated box (KRAB)-domain protein genes that suppress TE expansions. Although TE abundance facilitates chromosomal rearrangements, lungfish chromosomes still conservatively reflect the ur-tetrapod karyotype. Neoceratodus’ limb-like fins still resemble those of their extinct relatives and remained phenotypically static for about 100 Myr. We show that the secondary loss of limb-like appendages in the Lepidosiren–Protopterus ancestor was probably due to loss of sonic hedgehog limb-specific enhancers.
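The size comparisons in the abstract can be checked with simple arithmetic. The sketch below is illustrative only: the human genome size of about 3.1 Gb is an assumption (the abstract gives only the ratio), and the function names are hypothetical.

```python
# Back-of-the-envelope check of the genome-size figures in the abstract.
HUMAN_GENOME_GB = 3.1   # assumption: approximate human genome size in Gb
LEPIDOSIREN_GB = 91.0   # from the abstract: about 91 Gb


def fold_over_human(genome_gb, human_gb=HUMAN_GENOME_GB):
    """Return how many human genomes fit into a genome of the given size."""
    return genome_gb / human_gb


def growth_gb(duration_myr, myr_per_human_genome=10.0, human_gb=HUMAN_GENOME_GB):
    """DNA gained (Gb) at a rate of one human genome per `myr_per_human_genome`."""
    return duration_myr / myr_per_human_genome * human_gb


fold = fold_over_human(LEPIDOSIREN_GB)
gained = growth_gb(100.0)
print(f"Lepidosiren is about {fold:.1f} human genomes in size")
print(f"Growth over 100 Myr at the stated rate: about {gained:.0f} Gb")
```

Under these assumptions, the stated rate accounts for roughly 31 Gb of growth over the past 100 Myr, about a third of the present genome size.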

Fig. 1: Lungfish chromosomes help reconstruct the ur-tetrapod/vertebrate syntenic units.
Fig. 2: Genome and cell size evolution.
Fig. 3: Size distribution of clean reads of oxidized small RNA libraries from the three lungfish, amphibians and fish.
Fig. 4: Fin reduction in the Lepidosirenidae.

Data availability

Genome assemblies and sequencing data are available from NCBI Bioprojects PRJNA808321, PRJNA808322, PRJNA813994, PRJNA813995 and PRJNA981572 and at BioSamples SAMN26083907 and SAMN26533844. Gene and repeat annotations are available at Figshare (https://figshare.com/articles/dataset/Lungfish_genome_annotation/24147732)102.

Code availability

Custom codes have been deposited at https://github.com/dukecomeback/lungfish, https://gitlab.mpi-cbg.de/assembly/programs/manualcurationhic, https://gitlab.mpi-cbg.de/assembly/programs/polishing and https://github.com/MartinPippel/DAmar.

References

  1. Meyer, A. et al. Giant lungfish genome elucidates the conquest of land by vertebrates. Nature 590, 284–289 (2021).

  2. Wang, K. et al. African lungfish genome sheds light on the vertebrate water-to-land transition. Cell 184, 1362–1376.e1318 (2021).

  3. Irisarri, I. et al. Phylotranscriptomic consolidation of the jawed vertebrate timetree. Nat. Ecol. Evol. 1, 1370–1378 (2017).

  4. Krefft, J. L. G. Description of a gigantic amphibian allied to the genus Lepidosiren from the Wide-Bay district, Queensland. Proc. Zool. Soc. Lond. 1870, 221–224 (1870).

  5. Meyer, A. & Dolven, S. I. Molecules, fossils, and the origin of tetrapods. J. Mol. Evol. 35, 102–113 (1992).

  6. Kemp, A. The biology of the Australian lungfish, Neoceratodus forsteri (Krefft 1870). J. Morphol. 190, 181–198 (1986).

  7. Nowoshilow, S. et al. The axolotl genome and the evolution of key tissue formation regulators. Nature 554, 50–55 (2018).

  8. Shao, C. et al. The enormous repetitive Antarctic krill genome reveals environmental adaptations and population insights. Cell 186, 1279–1294.e1219 (2023).

  9. Oliveira, C. et al. Chromosome formulae of neotropical freshwater fishes. Rev. Brasil. Genet. 11, 577–624 (1988).

  10. Suzuki, A. & Yamanaka, K. Chromosomes of an African Lungfish, Protopterus annectens. Proc. Jpn Acad. B Phys. Biol. Sci. 64, 119–121 (1988).

  11. Nurk, S. et al. The complete sequence of a human genome. Science 376, 44–53 (2022).

  12. Irisarri, I. & Meyer, A. The identification of the closest living relative(s) of tetrapods: phylogenomic lessons for resolving short ancient internodes. Syst. Biol. 65, 1057–1075 (2016).

  13. Brownstein, C. D., Harrington, R. C. & Near, T. J. The biogeography of extant lungfishes traces the breakup of Gondwana. J. Biogeogr. 50, 1191–1198 (2023).

  14. Simakov, O. et al. Deeply conserved synteny resolves early events in vertebrate evolution. Nat. Ecol. Evol. 4, 820–830 (2020).

  15. Simakov, O. et al. Deeply conserved synteny and the evolution of metazoan chromosomes. Sci. Adv. 8, eabi5884 (2022).

  16. Muffato, M. et al. Reconstruction of hundreds of reference ancestral genomes across the eukaryotic kingdom. Nat. Ecol. Evol. 7, 355–366 (2023).

  17. Bourque, G. et al. Ten things you should know about transposable elements. Genome Biol. 19, 199 (2018).

  18. Meyer, A. & Schartl, M. Gene and genome duplications in vertebrates: the one-to-four (-to-eight in fish) rule and the evolution of novel gene functions. Curr. Opin. Cell Biol. 11, 699–704 (1999).

  19. Thomson, K. S. An attempt to reconstruct evolutionary changes in the cellular DNA content of lungfish. J. Exp. Zool. 180, 363–371 (1972).

  20. Gregory, T. R. The bigger the C-value, the larger the cell: genome size and red blood cell size in vertebrates. Blood Cells Mol. Dis. 27, 830–843 (2001).

  21. Nystedt, B. et al. The Norway spruce genome sequence and conifer genome evolution. Nature 497, 579–584 (2013).

  22. Falcon, F., Tanaka, E. M. & Rodriguez-Terrones, D. Transposon waves at the water-to-land transition. Curr. Opin. Genet. Dev. 81, 102059 (2023).

  23. Brennecke, J. et al. Discrete small RNA-generating loci as master regulators of transposon activity in Drosophila. Cell 128, 1089–1103 (2007).

  24. Yi, M. et al. Rapid evolution of piRNA pathway in the teleost fish: implication for an adaptation to transposon diversity. Genome Biol. Evol. 6, 1393–1407 (2014).

  25. Wang, J. et al. Transposable element and host silencing activity in gigantic genomes. Front. Cell Dev. Biol. 11, 1124374 (2023).

  26. Song, J. et al. Variation in piRNA and transposable element content in strains of Drosophila melanogaster. Genome Biol. Evol. 6, 2786–2798 (2014).

  27. Aravin, A. A. et al. A piRNA pathway primed by individual transposons is linked to de novo DNA methylation in mice. Mol. Cell 31, 785–799 (2008).

  28. Wang, W. et al. The initial uridine of primary piRNAs does not create the tenth adenine that is the hallmark of secondary piRNAs. Mol. Cell 56, 708–716 (2014).

  29. Pasquesi, G. I. M. et al. Vertebrate lineages exhibit diverse patterns of transposable element regulation and expression across tissues. Genome Biol. Evol. 12, 506–521 (2020).

  30. Kofler, R. piRNA clusters need a minimum size to control transposable element invasions. Genome Biol. Evol. 12, 736–749 (2020).

  31. Liu, X. et al. Transposable element expansion and low-level piRNA silencing in grasshoppers may cause genome gigantism. BMC Biol. 20, 243 (2022).

  32. Yang, P., Wang, Y. & Macfarlan, T. S. The role of KRAB-ZFPs in transposable element repression and mammalian evolution. Trends Genet. 33, 871–881 (2017).

  33. Imbeault, M., Helleboid, P.-Y. & Trono, D. KRAB zinc-finger proteins contribute to the evolution of gene regulatory networks. Nature 543, 550–554 (2017).

  34. Kaessmann, H., Vinckenbosch, N. & Long, M. RNA-based gene duplication: mechanistic and evolutionary insights. Nat. Rev. Genet. 10, 19–31 (2009).

  35. Carelli, F. N. et al. The life history of retrocopies illuminates the evolution of new mammalian genes. Genome Res. 26, 301–314 (2016).

  36. Chen, M. et al. Evolutionary patterns of RNA-based duplication in non-mammalian chordates. PLoS ONE 6, e21466 (2011).

  37. Okabe, M. & Graham, A. The origin of the parathyroid gland. Proc. Natl Acad. Sci. USA 101, 17716–17719 (2004).

  38. Li, C. et al. Genome sequences reveal global dispersal routes and suggest convergent genetic adaptations in seahorse evolution. Nat. Commun. 12, 1094 (2021).

  39. Kerr, T. The scales of modern lungfish. Proc. Zool. Soc. Lond. 125, 335–345 (1955).

  40. Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).

  41. Di-Poï, N., Montoya-Burgos, J. I. & Duboule, D. Atypical relaxation of structural constraints in Hox gene clusters of the green anole lizard. Genome Res. 19, 602–610 (2009).

  42. Feiner, N. Accumulation of transposable elements in Hox gene clusters during adaptive radiation of Anolis lizards. Proc. Biol. Sci. 283, 20161555 (2016).

  43. Woltering, J. M., Noordermeer, D., Leleu, M. & Duboule, D. Conservation and divergence of regulatory strategies at Hox loci and the origin of tetrapod digits. PLoS Biol. 12, e1001773 (2014).

  44. Berlivet, S. et al. Clustering of tissue-specific sub-TADs accompanies the regulation of HoxA genes in developing limbs. PLoS Genet. 9, e1004018 (2013).

  45. Kemp, A., Cavin, L. & Guinot, G. Evolutionary history of lungfishes with a new phylogeny of post-Devonian genera. Palaeogeogr. Palaeoclimatol. Palaeoecol. 471, 209–219 (2017).

  46. Díaz-González, F. et al. Biallelic cGMP-dependent type II protein kinase gene (PRKG2) variants cause a novel acromesomelic dysplasia. J. Med. Genet. 59, 28–38 (2022).

  47. Lewandowski, J. P. et al. Spatiotemporal regulation of GLI target genes in the mammalian limb bud. Dev. Biol. 406, 92–103 (2015).

  48. Breslow, D. K. et al. A CRISPR-based screen for Hedgehog signaling provides insights into ciliary function and ciliopathies. Nat. Genet. 50, 460–471 (2018).

  49. Yang, L. et al. Enlarged fins of Tibetan catfish provide new evidence of adaptation to high plateau. Sci. China Life Sci. 66, 1554–1568 (2023).

  50. Letelier, J. et al. The Shh/Gli3 gene regulatory network precedes the origin of paired fins and reveals the deep homology between distal fins and digits. Proc. Natl Acad. Sci. USA 118, e2100575118 (2021).

  51. Woltering, J. M. et al. Sarcopterygian fin ontogeny elucidates the origin of hands with digits. Sci. Adv. 6, eabc3510 (2020).

  52. Kvon, E. Z. et al. Comprehensive in vivo interrogation reveals phenotypic impact of human enhancer variants. Cell 180, 1262–1271.e1215 (2020).

  53. Roscito, J. G. et al. Convergent and lineage-specific genomic differences in limb regulatory elements in limbless reptile lineages. Cell Rep. 38, 110280 (2022).

  54. Ovchinnikov, V. et al. Caecilian genomes reveal the molecular basis of adaptation and convergent evolution of limblessness in snakes and caecilians. Mol. Biol. Evol. 40, msad102 (2023).

  55. Lopez-Rios, J. The many lives of SHH in limb development and evolution. Semin. Cell Dev. Biol. 49, 116–124 (2016).

  56. Farrell, E. R. & Münsterberg, A. E. csal1 is controlled by a combination of FGF and Wnt signals in developing limb buds. Dev. Biol. 225, 447–458 (2000).

  57. Carneiro, J. et al. Evidence of cryptic speciation in South American lungfish. J. Zool. Syst. Evol. Res. 59, 760–771 (2021).

  58. Storer, J., Hubley, R., Rosen, J., Wheeler, T. J. & Smit, A. F. The Dfam community resource of transposable element families, sequence models, and genome annotations. Mob. DNA 12, 2 (2021); https://pubmed.ncbi.nlm.nih.gov/33436076/.

  59. Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl Acad. Sci. USA 117, 9451–9457 (2020); https://pubmed.ncbi.nlm.nih.gov/32300014/.

  60. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999); https://pubmed.ncbi.nlm.nih.gov/9862982/.

  61. Bao, Z. & Eddy, S. R. Automated de novo identification of repeat sequence families in sequenced genomes. Genome Res. 12, 1269–1276 (2002).

  62. Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families in large genomes. Bioinformatics 21, i351–i358 (2005).

  63. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinform. 10, 421 (2009).

  64. Chalopin, D., Naville, M., Plard, F., Galiana, D. & Volff, J.-N. Comparative analysis of transposable elements highlights mobilome diversity and evolution in vertebrates. Genome Biol. Evol. 7, 567–580 (2015).

  65. Conte, M. A. et al. Chromosome-scale assemblies reveal the structural evolution of African cichlid genomes. Gigascience 8, giz030 (2019).

  66. Brawand, D. et al. The genomic substrate for adaptive radiation in African cichlid fish. Nature 513, 375–381 (2014).

  67. Kong, Y. et al. Transposable element expression in tumors is associated with immune infiltration and increased antigenicity. Nat. Commun. 10, 5228 (2019).

  68. Yang, W. R., Ardeljan, D., Pacyna, C. N., Payer, L. M. & Burns, K. H. SQuIRE reveals locus-specific regulation of interspersed repeat expression. Nucleic Acids Res. 47, e27 (2019).

  69. Peona, V. et al. The avian W chromosome is a refugium for endogenous retroviruses with likely effects on female-biased mutational load and genetic incompatibilities. Philos. Trans. R. Soc. Lond. B Biol. Sci. 376, 20200186 (2021).

  70. Finn, R. D. et al. Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230 (2014).

  71. Ellinghaus, D., Kurtz, S. & Willhoeft, U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinform. 9, 18 (2008).

  72. Steinbiss, S., Willhoeft, U., Gremme, G. & Kurtz, S. Fine-grained annotation and classification of de novo predicted LTR retrotransposons. Nucleic Acids Res. 37, 7002–7013 (2009).

  73. Llorens, C. et al. The Gypsy Database (GyDB) of mobile genetic elements: release 2.0. Nucleic Acids Res. 39, D70–D74 (2011).

  74. Groza, C., Chen, X., Wheeler, T. J., Bourque, G. & Goubert, C. GraffiTE: a unified framework to analyze transposable element insertion polymorphisms using genome-graphs. Preprint at bioRxiv https://doi.org/10.1101/2023.09.11.557209 (2023).

  75. She, R., Chu, J. S., Wang, K., Pei, J. & Chen, N. GenBlastA: enabling BLAST to identify homologous gene sequences. Genome Res. 19, 143–149 (2009).

  76. Pearson, W. R. Finding protein and nucleotide similarities with FASTA. Curr. Protoc. Bioinform. 53, 3.9.1–3.9.25 (2016).

  77. Birney, E., Clamp, M. & Durbin, R. GeneWise and Genomewise. Genome Res. 14, 988–995 (2004).

  78. Sellitto, A. et al. Molecular and functional characterization of the somatic PIWIL1/piRNA pathway in colorectal cancer cells. Cells 8, 1390 (2019).

  79. Schmieder, R. & Edwards, R. Quality control and preprocessing of metagenomic datasets. Bioinformatics 27, 863–864 (2011).

  80. Rosenkranz, D. & Zischler, H. proTRAC-a software for probabilistic piRNA cluster detection, visualization and analysis. BMC Bioinform. 13, 5 (2012).

  81. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinform. 10, 421 (2009).

  82. Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).

  83. Lartillot, N., Rodrigue, N., Stubbs, D. & Richer, J. PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment. Syst. Biol. 62, 611–615 (2013).

  84. Delsuc, F., Brinkmann, H., Chourrout, D. & Philippe, H. Tunicates and not cephalochordates are the closest living relatives of vertebrates. Nature 439, 965–968 (2006).

  85. Revell, L. J. phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol. Evol. 3, 217–223 (2012).

  86. Thomson, K. S. & Muraszko, K. Estimation of cell size and DNA content in fossil fishes and amphibians. J. Exp. Zool. 205, 315–320 (1978).

  87. Huang, Z. et al. Three amphioxus reference genomes reveal gene and chromosome evolution of chordates. Proc. Natl Acad. Sci. USA 120, e2201504120 (2023).

  88. Kautt, A. F. et al. Contrasting signatures of genomic divergence during sympatric speciation. Nature 588, 106–111 (2020).

  89. Suyama, M., Torrents, D. & Bork, P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 34, W609–W612 (2006).

  90. Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).

  91. Castresana, J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17, 540–552 (2000).

  92. Huerta-Cepas, J., Serra, F. & Bork, P. ETE 3: reconstruction, analysis, and visualization of phylogenomic data. Mol. Biol. Evol. 33, 1635–1638 (2016).

  93. Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).

  94. Deng, W., Nickle, D. C., Learn, G. H., Maust, B. & Mullins, J. I. ViroBLAST: a stand-alone BLAST web server for flexible queries of multiple databases and user’s datasets. Bioinformatics 23, 2334–2336 (2007).

  95. Montavon, T. et al. A regulatory archipelago controls Hox genes transcription in digits. Cell 147, 1132–1145 (2011).

  96. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).

  97. Wang, Y. et al. The 3D Genome Browser: a web-based browser for visualizing 3D genome organization and long-range chromatin interactions. Genome Biol. 19, 151 (2018).

  98. Ramírez, F. et al. High-resolution TADs reveal DNA sequences underlying genome organization in flies. Nat. Commun. 9, 189 (2018).

  99. Taylor, W. & Van Dyke, G. Revised procedures for staining and clearing small fishes and other vertebrates for bone and cartilage study. Cybium 9, 107–119 (1985).

  100. Kvon, E. Z. et al. Progressive loss of function in a limb enhancer during snake evolution. Cell 167, 633–642.e611 (2016).

  101. Osterwalder, M. et al. in Craniofacial Development Vol. 2403 (ed. Dworkin, S.) 147−186 (Humana, 2022).

  102. Du, K. Lungfish genome annotation. figshare https://doi.org/10.6084/m9.figshare.24147732.v1 (2024).

Acknowledgements

This work was supported by the German Research Foundation (DFG) through a grant to A.M., T. Burmester and M.S. (Me1725/24-1, Bu956/23-1, Scha408/16-1). O.S. was supported by the European Research Council’s Horizon 2020: European Union Research and Innovation Programme, grant no. 945026. Next-generation sequencing data production and data analysis were carried out at the DRESDEN-concept Genome Center, supported by the DFG Research Infrastructure Programme (project 407482635) and part of the Next Generation Sequencing Competence Network (project 423957469).

Author information

Contributions

A.M. and M.S. conceived the study and coordinated the work and, together with T. Burmester, secured the funding. Additional funding was provided by E. Myers. A.M. and M.S. wrote the manuscript with contributions from all other authors. S.W., M.P. and T. Brown performed high molecular weight DNA extraction, sequencing and genome assembly into contigs and Hi-C scaffolding. E. Myers supervised Hi-C and genomic sequencing, genome assembly and analysed data. P.F. undertook transcriptome analysis and annotation. K.D. performed the genome annotation and retrogene analysis. J.M.W. analysed and annotated hox clusters and performed gene loss analysis. I.S., L.O., E. Monteiro, D.B.A. and J.F.S. performed and analysed the lungfish treatment experiments. Z.C., S.J. and E.Z.K. analysed the L. paradoxa enhancer in mice. I.I. generated phylogenetic analyses and molecular clock and ancestral character state reconstructions. M.A. prepared the piRNAs for sequencing. S.K. performed positive selection analysis and analysed the piRNA landscapes. J.L., D.C. and A.S. performed transposon and repeat analyses. O.S. and M.L. performed synteny analyses.

Corresponding authors

Correspondence to Manfred Schartl or Axel Meyer.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Phylogenomics of lungfish.

a, Loci selection for phylogenomics. Graphs show different properties (root-to-tip variance, level of saturation, average patristic distance, compositional heterogeneity, proportion of variable sites, average bootstrap support, Robinson-Foulds similarity) for the 8,339 loci as inferred by genesortR. The graph of gene-wise log-likelihood differences shows support of each locus for two relevant alternative hypotheses (see Supplementary Information 2). b, Bayesian phylogram showing the evolutionary relationships and relative rates of the three lungfish genomes within the context of vertebrate phylogeny. The phylogeny was reconstructed as the consensus of 100 Markov chains (MCMC) from 100 independent gene jackknife replicates analyzed by PhyloBayes-MPI under the CAT mixture model (indicated with numbers on the internal edges, 1 = 100 replicates). The scale bar is the expected amino acid replacements per site. c, Bayesian time-calibrated phylogeny inferred from the set of 8,323 orthologs. Posterior probability distributions of estimated ages of common ancestors are plotted on tree nodes. The x axis is in million years and major geological periods are indicated (O. Ordovician, S. Silurian, De. Devonian, Ca. Carboniferous, P. Permian, Tr. Triassic, Ju. Jurassic, Cr. Cretaceous, P. Paleogene, N. Neogene).

Extended Data Fig. 2 High retention of ancestral linkage groups in lungfish genomes.

a-d, Species-to-species dotplots showing high degree of retained collinearity in the African and South American lungfish genomes, despite their genome size. b-d, Oxford dotplots representing orthologous genes shared on the previously reported ancestral linkage groups (ALGs)15. Chromosome numbering corresponds to the homologous lungfish linkage groups which have independently fused in individual lineages. Neoceratodus with its 27 chromosomes represented the most ancestral (unfused) state. e, Retention rates of lungfish chromosomes. Often only one alpha copy is present in lungfishes, e.g. descendants of several chromosomal elements have two alpha chromosomes in gar and Australian lungfish but only one clear alpha chromosome remains in South American and African lungfish (with the alpha copies having lost genes). Retention rates were computed as the percentage of the retained (present) ohnologs of gene families that comprise a given ancestral linkage group. Total number of gene families per chromosome was counted and their position was not taken into account. Only chromosomes with at least 5% ancestral linkage group retention were counted. Lower plots show retention on individual chromosomes (represented by dots) grouped by their ancestral linkage group in different lungfishes and gar.
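The retention-rate calculation described in panel e can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data layout (sets of gene-family identifiers), the function name and the toy values are hypothetical; only the percentage definition and the 5% cut-off come from the legend.

```python
def retention_rates(alg_families, chrom_families, min_percent=5.0):
    """Percentage of an ancestral linkage group's gene families found per chromosome.

    alg_families: set of gene-family IDs that comprise one ancestral linkage group.
    chrom_families: dict mapping chromosome name -> set of gene-family IDs present.
    Gene position is ignored; only presence is counted. Chromosomes below
    `min_percent` retention are left out, following the 5% cut-off in the legend.
    """
    if not alg_families:
        raise ValueError("ancestral linkage group has no gene families")
    rates = {}
    for chrom, present in chrom_families.items():
        percent = 100.0 * len(alg_families & present) / len(alg_families)
        if percent >= min_percent:
            rates[chrom] = percent
    return rates


# Hypothetical toy data: one linkage group made of 20 gene families.
toy_alg = {f"fam{i}" for i in range(20)}
toy_chromosomes = {
    "chrA": {f"fam{i}" for i in range(10)},   # 10 of 20 families present
    "chrB": {"unrelated1", "unrelated2"},     # none present, filtered out
    "chrC": {"fam18", "fam19", "unrelated3"}, # 2 of 20 families present
}
toy_rates = retention_rates(toy_alg, toy_chromosomes)
print(toy_rates)
```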

Extended Data Fig. 3 Genomic composition of repetitive elements.

a, Overall composition of repetitive elements from unmasked assemblies (two rounds of transposable element annotation) for the three lungfish (Lpa=Lepidosiren paradoxa, Pan=Protopterus annectens, Nfo=Neoceratodus forsteri), axolotl (Ame=Ambystoma mexicanum), and coelacanth (Lch=Latimeria chalumnae). The total TE coverage for each species is shown under each pie chart. RC, rolling-circle transposon; SINE, short interspersed element; LINE, long interspersed element; LTR, long terminal repeat; DNA, cut-and-paste DNA transposons. Total repeat coverage of other species analyzed in this study: Xenopus ~25%; Platyfish ~23%; Burtoni and Midas cichlids ~30%; and Pufferfish ~8%. b, Different repeat superfamilies expanded in lungfish genomes. Heatmap shows the repeat superfamily content of coelacanth (Lch=Latimeria chalumnae), axolotl (Ame=Ambystoma mexicanum) and three lungfish (Lpa=Lepidosiren paradoxa, Pan=Protopterus annectens, Nfo=Neoceratodus forsteri). The color is scaled to the genomic content across repeat superfamilies.

Extended Data Fig. 4 Expression of transposable element families.

a, b, Expression estimated for each transposable element family from poly(A)-enriched RNA-seq data. In all tissues, SINEs are more highly expressed than any other subclass in the African lungfish, while both LINEs and SINEs are slightly more expressed than any other subclass in the South American lungfish. n = 2029 (African lungfish) and 1897 (South American lungfish) transposable element families. Wilcoxon Signed Ranks Test (one-sided) was applied with * indicating p-value < 0.05, ** p-value < 0.005, *** p-value < 0.0005 and **** p-value < 0.00005. The box bounds the interquartile range divided by the median value, with the whiskers extending to a maximum of 1.5 times the interquartile range beyond the box. c, d, Higher expression of young transposable element families. When transposable element families are divided into young or old copies based on Kimura 2-parameter distance to consensus values (0–10% is young, >10% is old), young TEs are significantly more highly expressed than old ones, suggesting that several types of TEs remain active and contribute to the ongoing expansion of the lungfish genomes. Out of the 13 SINE families of Protopterus annectens, only copies from the SINE/t-RNA-V-RTE are considered as young. e, f, Correlation between expression of transposable element families and copy number. Expression was estimated for each transposable element family using poly(A)-enriched RNA-seq data. For all tissues and transposable element classes, a positive correlation is observed between expression level and copy number. When a transposable element family is highly expressed, this family tends to have more copies. All analyzed correlations are significantly positive (p-values < 0.001). A linear model estimated trend line and calculated 95% confidence interval around the trend (gray fill) are plotted (two-sided). Lpa, Lepidosiren paradoxa; Pan, Protopterus annectens.
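The young/old split used in panels c and d can be illustrated with a short sketch. Only the 10% Kimura threshold comes from the legend; the tuple layout, function names and expression values are hypothetical, and the significance test itself (Wilcoxon) is not reproduced here.

```python
from statistics import median

YOUNG_MAX_PERCENT = 10.0  # from the legend: 0-10% is young, >10% is old


def age_class(kimura_percent, young_max=YOUNG_MAX_PERCENT):
    """Label a TE family 'young' or 'old' from its Kimura distance (%) to consensus."""
    if kimura_percent < 0:
        raise ValueError("Kimura distance cannot be negative")
    return "young" if kimura_percent <= young_max else "old"


def median_expression_by_age(families):
    """Group (name, kimura_percent, expression) tuples by age class and take medians."""
    groups = {"young": [], "old": []}
    for _name, kimura_percent, expression in families:
        groups[age_class(kimura_percent)].append(expression)
    return {label: median(values) for label, values in groups.items() if values}


# Hypothetical toy data, not values from the study.
toy_families = [
    ("family_a", 2.5, 40.0),
    ("family_b", 8.0, 25.0),
    ("family_c", 10.0, 30.0),
    ("family_d", 14.0, 6.0),
    ("family_e", 27.0, 3.0),
]
medians = median_expression_by_age(toy_families)
print(medians)
```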

Extended Data Fig. 5 Age estimation and comparison of full-length TEs across lungfish genomes.

a, Landscape of subclasses of transposable elements. Kimura substitution level (%) for each copy against its consensus sequence used as proxy for expansion history of the transposable elements. Older copies accumulated more nucleotide substitutions and show higher distance to the consensus sequences. The phylogeny depicts the estimation of divergence times among the five studied species. RC, rolling-circle transposon; SINE, short interspersed element; LINE, long interspersed element; LTR, long terminal repeat. b, Copy numbers of full-length TEs within orders. c, Copy numbers of full-length TEs within superfamilies, color scaled to copy number. d, Percentage of transcribed TEs. e, Example of synteny to show one full-length copy from LINE/CR1 exclusively present in our Protopterus genome and absent in the other individual’s genome. f, Comparison of expression between full-length and fragmented TEs. n = 122,832,031 (South American lungfish), 66,736,976 (African lungfish) and 58,296,831 transposable elements. Wilcoxon Signed Ranks Test (one-sided) was applied with **** indicating p-value < 0.00005. The box bounds the interquartile range divided by the median value, with the whiskers extending to a maximum of 1.5 times the interquartile range beyond the box and the middle dots indicate mean values. Lpa=Lepidosiren paradoxa, South American lungfish; Pan=Protopterus annectens, African lungfish; Nfo=Neoceratodus forsteri, Australian lungfish.
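For readers unfamiliar with the Kimura substitution level in panel a, the standard Kimura 2-parameter distance is K = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transition and transversion differences. The sketch below applies that textbook formula to a gap-free pairwise alignment of a copy against its consensus; it is an illustration only and does not reproduce the annotation software or any corrections used in the study. The function name and toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(copy_seq, consensus_seq):
    """Kimura 2-parameter distance between two aligned, equal-length sequences.

    Sites where either sequence has a non-ACGT character are skipped.
    Raises ValueError if the sequences are too diverged for the formula.
    """
    if len(copy_seq) != len(consensus_seq):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(copy_seq.upper(), consensus_seq.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


consensus = "ACGTACGTAC"
copy_one_transition = "GCGTACGTAC"  # first base A -> G
k_same = k2p_distance(consensus, consensus)
k_one = k2p_distance(copy_one_transition, consensus)
print(f"identical: {abs(k_same):.3f}, one transition in ten sites: {k_one:.3f}")
```

The corrected distance is slightly larger than the raw proportion of differing sites because it accounts for multiple substitutions at the same site.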

Extended Data Fig. 6 Size distribution and correlation between piRNA content and genome size.

a, Size distribution of clean reads of unoxidized small RNA libraries of the same individuals as used for the piRNA analysis, with the position of the peaks for miRNA and piRNA marked with dotted lines. In contrast to the oxidized samples, African and South American lungfish have a clear peak at the expected size range of miRNAs (~24 nts) but, unlike the other species, no second distinct peak at the expected size range of piRNAs. b, Spearman rank correlation between genome size (log scale) and piRNA content (%RNA of clean tag) from the oxidized testis small RNAs (silhouettes as in a).
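The Spearman rank correlation in panel b can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the authors' analysis code: ties receive average ranks, the coefficient is the Pearson correlation of the ranks, and the names `average_ranks` and `spearman_rho` are hypothetical. Because ranks are unchanged by any increasing transformation, plotting genome size on a log scale does not alter the coefficient.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

A call such as `spearman_rho(genome_sizes_gb, pirna_percent)` takes two equal-length lists, one value per species; both list names are placeholders, and no values from the study are reproduced here.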

Extended Data Fig. 7 Signature nucleotides of piRNAs, piRNA cluster structure and KZFP genes.

a, Proportion of nucleotides of the small RNA reads at the first position (left) and the tenth position (right) of the three lungfish, amphibian and fish samples. b, Graphical proTRAC output of a representative piRNA cluster for the pufferfish (left panel) and the South American lungfish (right panel). The top part visualizes the number of genomic hits produced by the query piRNA sequence. Dark green indicates that there is only one sequence hit in the genome, and dark red indicates more than 1000 hits. Below is the sequence read coverage plot (blue: reads on the plus strand, red: reads on the minus strand). The RepeatMasker bar shows TEs annotated by RepeatMasker in this region. Lungfish clusters tend to have lower diversity and a higher read count. c, C2H2 zinc-finger and KRAB domain protein (KZFP) gene counts and genomic organization in sarcopterygians. Left, number of KZFP genes in indicated genomes. Right, gene length of KZFP genes in indicated species. n = 1168 KZFPs. Wilcoxon signed-rank test (one-sided) was applied with **** indicating p-value < 0.00005. The box bounds the interquartile range divided by the median value, with the whiskers extending to a maximum of 1.5 times the interquartile range beyond the box. Lpa, Lepidosiren paradoxa; Pan, Protopterus annectens; Nfo, Neoceratodus forsteri; Lch, Latimeria chalumnae; Hsa, Homo sapiens; Gga, Gallus gallus.
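The first- and tenth-position nucleotide proportions of panel a are simple per-position tallies over the reads. A minimal sketch follows, assuming reads are given as plain strings in DNA letters (T is reported as U) and that reads shorter than the requested position are ignored. The function name `position_composition` is hypothetical. A uridine bias at position 1 and an adenine bias at position 10 are the signatures typically looked for in piRNA populations.

```python
from collections import Counter


def position_composition(reads, position):
    """Fraction of A, C, G and U at a 1-based position across small RNA reads.

    Reads shorter than `position` are skipped; T is counted as U.
    Symbols other than A, C, G, U (for example N) count toward the total only.
    """
    counts = Counter(
        read[position - 1].upper().replace("T", "U")
        for read in reads
        if len(read) >= position
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no read is long enough for this position")
    return {nt: counts.get(nt, 0) / total for nt in "ACGU"}
```

Calling `position_composition(reads, 1)` and `position_composition(reads, 10)` on the same list of reads gives the two bar groups of such a plot for one sample.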

Extended Data Fig. 8 Positively selected genes and gene losses.

a, Positively selected genes in all three lungfishes related to lungfish biology. b, Numerous gene losses in Lepidosiren paradoxa and Protopterus annectens indicate a cellular milieu that is permissive of transposon spreading due to a reduction in the DNA damage response and apoptosis. Due to low piRNA levels (through an as yet unidentified mechanism), transposable elements are highly active in the germline, resulting in frequent insertions and high levels of genotoxic stress from double-stranded DNA breaks. Such breaks tend to result in G1 arrest and apoptosis as part of the DNA damage response, which provides a mechanism for somatic selection against compromised cells. These gene losses are expected to reduce the levels of such selection and create a permissive environment for DNA transposition, and they help explain the rapid expansion of the lungfishes’ genomes. c, The synteny block spanning RASGEF1B to ANTXR2 is widely preserved across vertebrates. The region containing RASGEF1B to PRDM8 has been deleted in Lepidosiren paradoxa and Protopterus annectens. The ciliary CFAP299 gene is still present in both species as an intronless retrogene. Loss of BMP3 can be linked to the reduced squamation of the derived Lepidosirenidae, while loss of PRKG2 and RASGEF1B can be linked to their derived fins. In the ray-finned fish Astatotilapia burtoni, BMP3 is strongly expressed in the developing scales at 12 dpf. d, TTC23 is a component of the primary cilia and is involved in the cellular perception of the shh signal transduction pathway. TTC23 is located in a highly conserved gene block which is also preserved in Lepidosiren paradoxa and Protopterus annectens, but without an identifiable TTC23 gene present. This “ghost locus” was further analyzed using Lagan Vista. Paired Lagan using the translated anchoring option and the coelacanth sequence as baseline identifies the TTC23 exons in human, spotted gar and Neoceratodus forsteri, but not in Lepidosiren paradoxa and Protopterus annectens.

Extended Data Fig. 9 Expanded hox clusters preserve regulatory landscape architecture.

a, The lungfish Hox clusters have expanded dramatically: the Lepidosiren paradoxa clusters are approximately 20-fold enlarged compared to mouse, although this is lower than the proportional difference in genome size. Consistent with this observation, all four clusters preserve a conserved core subcluster (indicated in red) that has expanded relatively little and is low in repeat content. These regions are hoxa4-a11, hoxb2-b9, hoxc4-c11 and hoxd8-d11, indicating topological constraints on the expansion of these regions. In addition, hoxa3 and hoxd3 (purple) show expansion of their intronic region, which is similar to the expansion of the hoxa3 intron in the expanded axolotl Hoxa cluster7. An interesting difference is that the hoxa11-hoxa13 intergenic region shows a tendency for expansion in lungfishes but not in axolotl, potentially related to additional constraints induced by the fin-to-limb transition. Furthermore, signatures of repeat insertion in the anterior Hoxc and posterior Hoxb clusters mirror those observed in Anolis lizards41. b, HiC analysis for Midas cichlid, human and Protopterus annectens Hoxa and Hoxd clusters. Despite the approximately 70-fold size difference between these species, there is a remarkable conservation of the flanking regulatory landscapes, whereby both clusters are present at the intersection of a 3’ and a 5’ TAD. Known fin and limb enhancers (blue ovals) are conserved in an expected fashion (open ovals for Lepidosirenidae mm406 and e10 indicate secondary loss), altogether suggesting that long-range regulatory landscapes remain preserved under conditions of genome expansion. Synteny regions shown encompass the following sizes: Hoxa: Pan 3.2 Mb, Hsa 3.1 Mb, Aci 0.31 Mb; Hoxd: Pan 28 Mb, Hsa 2.8 Mb, Aci 0.41 Mb. Species name abbreviations are the same as in the other figures.

Extended Data Fig. 10 Functional analysis of lungfish ZRS and SAG treatment of Lepidosiren paradoxa regenerating fins.

a, Mouse transgenesis and LacZ staining for the Neoceratodus forsteri and Lepidosiren paradoxa ZRS sequences. Genotyping indicates whether the insertion was in a single or double copy at the targeted locus, or randomly integrated in the genome. The Neoceratodus forsteri ZRS gives ZPA staining in 16/16 embryos, whereas the Lepidosiren paradoxa ZRS does not give staining in 15/15 embryos. b, Regeneration of pectoral fins in the presence of the shh agonist SAG does not result in radial growth in Lepidosiren paradoxa (n = 3 for SAG-treated animals, n = 3 for DMSO-treated animals; representative images of one animal per treatment are shown).

Supplementary information

Supplementary Information

Supplementary Information sections 1–6, including Tables 9–15, Fig. 1, appendix and references.

Reporting Summary

Supplementary Table 1

Statistics for assemblies and genome annotations of lungfishes.

Supplementary Table 2

Naming of lungfish chromosomes according to ancestral lungfish units.

Supplementary Table 3

piRNA sequencing statistics.

Supplementary Table 4

proTRAC result of TE mapping piRNAs.

Supplementary Table 5

a, Presence of piRNA machinery genes in genomes of lungfish and other vertebrates. b, Expression of piRNA machinery genes in lungfish.

Supplementary Table 6

Retrocopy genes in the genomes of the South American and African lungfish and the coelacanth.

Supplementary Table 7

a, Positively selected genes site class 3. b, Positively selected genes site class 4.

Supplementary Table 8

Assemblies used for comparative analyses for positively selected genes, piRNA landscape and repeat content.

Supplementary Data 1

Peer Review File

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Schartl, M., Woltering, J.M., Irisarri, I. et al. The genomes of all lungfish inform on genome expansion and tetrapod evolution. Nature 634, 96–103 (2024). https://doi.org/10.1038/s41586-024-07830-1


  • DOI: https://doi.org/10.1038/s41586-024-07830-1
