Global SARS-CoV-2 genetic sequence distribution in early COVID-19

In a recent study published in the journal PLoS ONE, researchers evaluated the temporal spread and geographical distribution of genomic variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).

Study: Worldwide SARS-CoV-2 haplotype distribution in early pandemic. Image credit: Fit Ztudio / Shutterstock

The coronavirus disease 2019 (COVID-19) pandemic has markedly affected all nations across the globe and caused unprecedented morbidity and mortality. Since the release of the first SARS-CoV-2 genetic sequence on January 5, 2020, the deleterious effects of its genetic clusters, attributed to the presence of several mutations, have been reported.

Understanding the genetic aspects and regional spread of SARS-CoV-2 can aid the development of more effective and targeted vaccines. Thus, in this study, the researchers examined the temporal occurrence and geographical location of multiple mutated SARS-CoV-2 variants.

About the study

The researchers searched the National Center for Biotechnology Information (NCBI) and Global Initiative on Sharing All Influenza Data (GISAID) databases for sequences deposited between December 2019 and September 2020, from which 77,648 viral genomes were identified.

Data including geographical distribution, sampling date, and length of the genetic sequences were obtained. Only the 75,401 genomes with sequences longer than 29,000 nucleotides were analyzed. The 53 variants present in more than 1,000 genetic sequences were classified into clades, and their haplotypes were counted. Only countries with variants present in more than 50 sequences were selected for the analysis.
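The filtering steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the `Genome` record, its field names, and the choice to treat a haplotype as the set of frequent variants carried by one genome are hypothetical. Only the two thresholds (29,000 nucleotides and 1,000 sequences) come from the article.

```python
from collections import Counter
from dataclasses import dataclass

MIN_LENGTH = 29_000            # nucleotides, threshold reported in the article
MIN_VARIANT_SEQUENCES = 1_000  # sequences, threshold reported in the article


@dataclass(frozen=True)
class Genome:
    """Hypothetical record for one deposited genome (field names are assumptions)."""
    accession: str
    country: str
    length: int
    variants: frozenset  # variant labels relative to the reference sequence


def filter_by_length(genomes, min_length=MIN_LENGTH):
    """Keep genomes whose sequence is longer than min_length nucleotides."""
    return [g for g in genomes if g.length > min_length]


def frequent_variants(genomes, min_sequences=MIN_VARIANT_SEQUENCES):
    """Return the variants present in more than min_sequences genomes."""
    counts = Counter(v for g in genomes for v in g.variants)
    return {v for v, n in counts.items() if n > min_sequences}


def count_haplotypes(genomes, variants_of_interest):
    """Count haplotypes, taken here as the set of frequent variants per genome."""
    return Counter(g.variants & variants_of_interest for g in genomes)


if __name__ == "__main__":
    # Toy data with a lowered variant threshold so the example stays small.
    toy = [
        Genome("a", "UK", 29_903, frozenset({"p.Asp614Gly", "c.-25C>T"})),
        Genome("b", "UK", 29_850, frozenset({"p.Asp614Gly"})),
        Genome("c", "IT", 28_000, frozenset({"p.Asp614Gly"})),
        Genome("d", "CN", 29_903, frozenset()),
    ]
    kept = filter_by_length(toy)
    common = frequent_variants(kept, min_sequences=1)
    print(len(kept), common, count_haplotypes(kept, common))
```

With real data, the same functions would be run with the default thresholds, and a further per-country filter (more than 50 sequences) would be applied before comparing regions.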

Results and discussion

Using the earliest SARS-CoV-2 sequence identified in Wuhan as a reference, the authors observed 26,539 mutations of several types, including missense (57%), synonymous (28%), insertion/deletion (7%), and stop (2%) variants, of which 4% and 3% were present in the 3′ and 5′ untranslated regions (UTRs), respectively, giving rise to five to nine mutant variants. Variants with similar geographical spread were grouped into four genetic clades.

A majority (58%) of the sequences were sourced from Europe, particularly the United Kingdom, followed by Oceania and North America, with the smallest share from Asia. Most mutated variants were observed in France and Italy, whereas most unmutated variants were present in China, followed by the United States, Northern Europe, Australia, South Africa, Brazil, Canada, and India.

Geographical distribution of the minimum number of variants per sequence.

The most frequently reported and highly infectious mutant variant, p.Asp614Gly, a missense mutation in the S glycoprotein, was observed in combination with three other variants: p.Pro4715Leu and p.Phe924Phe in the ORF1ab gene, and c.-25C>T in the 5′ untranslated region (UTR). These variants were grouped into the first clade, observed in 55,582 (74%) genetic sequences.

They were first reported in Europe, by the UK, in January 2020 and were also observed in Italy (93%), North America (73%), South America (77%), Oceania (73%), and Africa (77%). This genetic clade produced two subclades, 1A and 1B, observed mainly in Oceania and North America, respectively.
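As a quick arithmetic check on the 74% figure, the short sketch below compares the clade count against the two genome totals given in the article. It suggests that the percentage is taken over the 75,401 analyzed genomes, not over all 77,648 identified genomes. The variable and function names are illustrative only.

```python
# Counts reported in the article.
all_genomes = 77_648       # genomes identified in NCBI and GISAID
analyzed_genomes = 75_401  # genomes longer than 29,000 nucleotides
clade1_sequences = 55_582  # sequences carrying the first-clade variants


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


print(f"Share of analyzed genomes: {share(clade1_sequences, analyzed_genomes):.1f}%")
print(f"Share of all genomes:      {share(clade1_sequences, all_genomes):.1f}%")
```

The first value rounds to the 74% quoted in the text, while the second does not.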

Frequency and geographical distribution of subclades 1A and 1B.

The second clade, consisting of two variants, was present in North America, Spain, Asia (China, Thailand, and Kazakhstan), Australia, and Africa (Nigeria and Ghana). The third and fourth clades were mainly observed in Europe. Singaporean and Australian genetic sequences carried five and six variants, respectively, whereas sequences from Kazakhstan and Spain carried four variants.

The four genetic clades formed the core of 1,213 haplotypes. The first, second, and third haplotypes were most commonly observed in South America and Europe, mainly spreading to North America, Africa, and Europe.

Several genetic variants were localized to specific countries, for example, the 313 genetic variants located in Japan and the variants 29829 and 18877 limited to Saudi Arabia. The genetic variant p.Asn501Tyr was observed in the United States and Brazil in early April and had spread to Australia by June 2020.


Based on the study results, the authors concluded that several SARS-CoV-2 mutant variants exist globally, mainly grouped into four clades with specific geographical locations, reflecting the spatial and temporal spread of the mutant strains.

The researchers also noted that synergistic and concomitant effects of genetic mutations could give the viral variants significant advantages in transmission and infectivity through increased binding to angiotensin-converting enzyme 2 (ACE2) receptors on human cells.

The results of the study highlight the stronghold of the highly infectious SARS-CoV-2 virus across the globe. Therefore, strategies and vaccines formulated to target these genetic mutations may improve the standard of care for COVID-19 patients. However, future studies with more uniform geographical representation across nations and correlations between the genetic variability and the clinical manifestations of COVID-19 are needed to devise more effective therapeutic policies.

Journal reference:

  • Cairo A, Iorio MV, Spena S, Tagliabue E, Peyvandi F (2022) Worldwide SARS-CoV-2 haplotype distribution in early pandemic. PLoS ONE 17(2): e0263705. DOI: journal.pone.0263705
