Insect genomes, by the numbers. New #openaccess paper w/ @paulbfrandsen and @joannalkelley.
Fun facts:
- 536 species w/ nuclear genomes
- Aquatic insects are WAY underrepresented
- 29 terrestrial insects to chromosome-level, 0 aquatic
More below...
https://www.mdpi.com/2075-4450/11/9/601

Have a look at the taxonomic spread yourself here. The complete data set is also included in Supplemental as an Excel file (with accession numbers!) for easy sorting/exploring. Data are as of July 2020.
To the aquatic insects point (our title and major takeaway): given the numbers of described species, there should be 1 aquatic insect genome for every 9 terrestrial. Instead there are 23:1 terrestrial to aquatic. We knew there was a big disparity, but didn't expect that!
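To put a number on that gap, here is a rough back-of-the-envelope sketch using only the two ratios quoted above (9:1 expected, 23:1 observed). The function and variable names are made up for illustration; nothing here comes from the paper's own analysis code.

```python
from fractions import Fraction

def aquatic_share(terrestrial_per_aquatic):
    """Share of all genomes that are aquatic, given a terrestrial:aquatic ratio of N:1."""
    return Fraction(1, terrestrial_per_aquatic + 1)

# Ratios quoted in the thread (terrestrial genomes per aquatic genome).
EXPECTED_RATIO = 9   # based on numbers of described species
OBSERVED_RATIO = 23  # based on available genome assemblies

expected_share = aquatic_share(EXPECTED_RATIO)
observed_share = aquatic_share(OBSERVED_RATIO)

# How many times rarer aquatic genomes are, relative to terrestrial, than expected.
shortfall = Fraction(OBSERVED_RATIO, EXPECTED_RATIO)

print(f"Expected aquatic share: {float(expected_share):.1%}")
print(f"Observed aquatic share: {float(observed_share):.1%}")
print(f"Relative shortfall: {float(shortfall):.2f}x")
```

In other words, under these two ratios aquatic insects would make up about a tenth of genomes if sampling tracked described diversity, but they make up closer to one in twenty-four.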
In terms of contiguity, mean contig N50 for terrestrial genomes (N=485): ~933 kbp. For aquatic insects (N=20), just 259 kbp. Moreover, there are only *2* aquatic insect genomes that exceeded our "highly contiguous" (N50 > 1 Mbp) threshold. 29 for terrestrial insects!
Three orders (unsurprisingly) make up the bulk of insect genomes: Diptera (131), Hymenoptera (130), and Lepidoptera (107). Surprised to see so many beetles (Coleoptera, N=32) and Hemiptera (N=43) though!
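For a sense of what "the bulk" means, a quick sketch tallying the order counts above against the 536-species total from the first tweet. It assumes the per-order numbers and the 536 total count the same thing (species with a nuclear genome); the variable names are illustrative only.

```python
# Counts quoted in the thread; the total is assumed to be in the same units (species).
TOTAL_SPECIES = 536
order_counts = {
    "Diptera": 131,
    "Hymenoptera": 130,
    "Lepidoptera": 107,
    "Hemiptera": 43,
    "Coleoptera": 32,
}

# Combined share of the three best-sampled orders.
top_three = sum(order_counts[o] for o in ("Diptera", "Hymenoptera", "Lepidoptera"))
top_three_share = top_three / TOTAL_SPECIES

for order, n in order_counts.items():
    print(f"{order:<12} {n:>4} {n / TOTAL_SPECIES:6.1%}")
print(f"Top three combined: {top_three} of {TOTAL_SPECIES} ({top_three_share:.1%})")
```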
12 orders don't have any genomic representation in the insect Tree of Life:
- Protura
- Diplura
- Zygentoma
- Zoraptera
- Mantophasmatodea
- Grylloblattodea
- Embioptera
- Mantodea
- Raphidioptera
- Megaloptera
- Neuroptera
- Mecoptera
Only 4 insect orders have chromosome-level genome assemblies to leverage:
- Hemiptera (true bugs, etc.): N=2
- Hymenoptera (bees, etc.): N=8
- Lepidoptera (butterflies, etc.): N=4
- Diptera (true flies): N=15
Aquatic insects? Not a single one.
So what does this mean? We need to fill in some gaps! While efforts like @Arthropod_i5K are awesome and doing key work, developing high-quality genomic resources for underrepresented groups would go a long way. With new tech (@PacBio HiFi, etc.), it's easier than ever!