Polymorphism of SARS-CoV Genomes

Acta Genetica Sinica
Volume 33, Issue 4, April 2006, Pages 354-364

SHANG Lei, QI Yan, BAO Qi-Yu, TIAN Wei, XU Jian-Cheng, FENG Ming-Guang, YANG Huan-Ming

Abstract

In this work, the severe acute respiratory syndrome associated coronavirus (SARS-CoV) genome BJ202 (AY864806) was completely sequenced. The genome was obtained directly from the stool sample of a patient in Beijing. Comparative genomics methods were used to analyze the sequence variations of 116 SARS-CoV genomes (including BJ202) available in NCBI GenBank. With the genome sequence of GZ02 as the reference, 41 polymorphic sites were identified in BJ202, and a total of 278 polymorphic sites were present in at least two of the 116 genomes. The polymorphic sites were unevenly distributed over the genome. About half of the variations (50.4%, 140/278) clustered in the one third of the genome at the 3′ end (19.0–29.7 kb). Regions encoding Orf10–11, Orf3/4, E, M and S proteins had the highest mutation rates. A total of 15 PCR products (about 6.0 kb of the genome), comprising 11 fragments containing 12 known polymorphic sites and 4 fragments without identified polymorphic sites, were cloned and sequenced. The results showed that 3 polymorphic sites unique to BJ202 (positions 13 804, 15 031 and 20 792), along with 3 other polymorphic sites (26 428, 26 477 and 27 243), each contained 2 kinds of nucleotides. Interestingly, position 18 379, which has not been identified as polymorphic in any of the other 115 published SARS-CoV genomes, is in fact a polymorphic site. The nucleotide composition of this site is A (8) to G (6). Among the 116 SARS-CoV genomes, 18 types of deletions and 2 insertions were identified. Most of them were related to a 300 bp region (27 700–28 000) that encodes parts of the putative ORF9 and ORF10–11. A phylogenetic tree illustrating the divergence of the whole BJ202 genome from the 115 other completely sequenced SARS-CoVs was also constructed. BJ202 was phylogenetically closer to BJ01 and LLJ-2004.
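The site counting summarized in the abstract can be illustrated with a minimal sketch. It assumes the genomes have already been aligned to equal length, and it reads "present in at least two genomes" as "at least two genomes differ from the reference base at that column"; the abstract does not state the exact criterion. The function names, the handling of gaps and ambiguous bases, and the toy sequences are hypothetical and are not taken from the paper.

```python
def polymorphic_sites(aligned, reference, min_genomes=2):
    """Return 1-based alignment columns where at least `min_genomes`
    sequences carry a base that differs from the reference base.

    `aligned` maps genome name -> aligned sequence (uppercase, equal length).
    Gaps and ambiguous bases are ignored (an assumption of this sketch).
    """
    ref_seq = aligned[reference]
    if any(len(seq) != len(ref_seq) for seq in aligned.values()):
        raise ValueError("all sequences must be aligned to the same length")

    sites = []
    for col, ref_base in enumerate(ref_seq):
        if ref_base not in "ACGT":
            continue  # skip columns where the reference has a gap or N
        differing = sum(
            1
            for name, seq in aligned.items()
            if name != reference and seq[col] in "ACGT" and seq[col] != ref_base
        )
        if differing >= min_genomes:
            sites.append(col + 1)
    return sites


def fraction_in_region(sites, start, end):
    """Count the sites inside [start, end] and their share of all sites."""
    inside = [pos for pos in sites if start <= pos <= end]
    share = len(inside) / len(sites) if sites else 0.0
    return len(inside), share


# Toy alignment (made-up sequences, not SARS-CoV data).
toy = {
    "REF": "ACGTACGTAC",
    "G1":  "ACGTACGTAT",
    "G2":  "ACTTACGTAT",
    "G3":  "ACTTACG-AC",
}
sites = polymorphic_sites(toy, "REF")
print(sites)                              # [3, 10]
print(fraction_in_region(sites, 6, 10))   # (1, 0.5)

# The share reported in the abstract: 140 of 278 sites in the 3' region.
print(round(100 * 140 / 278, 1))          # 50.4
```

On real data, `fraction_in_region` would be called with the coordinates of the 3′ region (19.0–29.7 kb) to reproduce the kind of proportion quoted above.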

Key words

severe acute respiratory syndrome associated coronavirus (SARS-CoV), genome, polymorphism