Indonesia has finished whole-genome sequencing of the SARS-CoV-2 virus circulating in the country. Virus samples from Indonesian patients indicate that they belong to the globally dominant group.
TWO months and two days after announcing its first case of the Coronavirus Disease 2019 (Covid-19) on March 2, Indonesia has finally completed whole-genome sequencing of the SARS-CoV-2 virus. The Eijkman Institute for Molecular Biology submitted three complete genome sequences of the virus from Indonesian patients to the Global Initiative on Sharing All Influenza Data (GISAID) on May 4. “Almost all available staff and facilities were occupied with Covid-19 detection. But we managed to recruit several individuals for the sequencing,” said Eijkman Institute Director Amin Soebandrio on May 5.
Sequencing is the process of determining the order of nucleotide bases in deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) molecules. A DNA molecule consists of a sequence of four types of nucleotide bases: cytosine (C), guanine (G), adenine (A), and thymine (T). The order of these bases determines the instructions, or genetic code, carried by the DNA. In RNA, uracil (U) takes the place of thymine (T). SARS-CoV-2 is an RNA virus.
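The difference between the DNA and RNA alphabets described above can be illustrated with a short Python sketch. The 12-base fragment and the helper names `to_rna` and `base_counts` are invented for illustration; they are not taken from any of the submitted genomes or from any sequencing tool.

```python
from collections import Counter


def to_rna(dna: str) -> str:
    """Rewrite a DNA-letter sequence in the RNA alphabet (T becomes U)."""
    return dna.upper().replace("T", "U")


def base_counts(seq: str) -> Counter:
    """Count how often each base letter occurs in a sequence."""
    return Counter(seq.upper())


# Made-up fragment, for illustration only (not real viral data).
fragment = "ACGTTGCAATGC"
rna = to_rna(fragment)

print(rna)                     # the same fragment written with U instead of T
print(dict(base_counts(rna)))  # number of A, C, G and U letters
```

Sequence databases commonly store RNA virus genomes using the DNA letters, so a conversion like this is only a matter of notation.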
The genome of Wuhan-Hu-1/2019, the first SARS-CoV-2 virus to be discovered in the world, was sequenced by a team of Chinese scientists led by Yong-Zheng Zhang from the Fudan University School of Public Health, Shanghai. Data on the virus, whose complete genome consists of 29,881 base pairs, were submitted to GenBank and GISAID on January 12. The Wuhan-Hu-1/2019 sample was obtained from a 41-year-old patient admitted to the Central Hospital of Wuhan on December 26, 2019, with complaints of fever, chest tightness, and cough.
The whole genomes of the three SARS-CoV-2 viruses from Indonesia also consist of around 29,000 base pairs. Based on an ongoing analysis by GISAID, Amin said that the three viruses, named JKT-EIJK0141/2020, JKT-EIJK0317/2020, and JKT-EIJK2444/2020, do not match any of the three types of coronavirus (S, G, and V) grouped by GISAID. “There is a possibility that they belong to a new group from Southeast Asia. However, they may also be categorized as members of existing groups after a more thorough analysis,” said Amin when contacted on May 10.
The phylogenetic tree of SARS-CoV-2 from Indonesia generated by Nextstrain.org/TEMPO
Following the submission of four additional whole-genome sequences on May 9, Amin said that his institution has sent a total of seven genome sequences of the virus to GISAID. The four viruses are named JKT-EIJK01/2020, JKT-EIJK02/2020, JKT-EIJK03/2020, and JKT-EIJK04/2020. “They are all from Jakarta. We chose the highest viral loads so that we could immediately begin sequencing without having to culture the virus first,” he said. Amin added that his institution has also tried to obtain sequences from other institutions because it needs data from outside Jakarta.
The only institution other than the Eijkman Institute that has completed sequencing of the SARS-CoV-2 virus from Indonesian patients is Airlangga University’s Institute of Tropical Disease (ITD) in Surabaya, East Java. ITD Director Maria Inge Lusida told The Jakarta Post that her institution has completed six whole-genome sequences of the virus. “Four isolated sequences of the virus we have analyzed are closer to the Chinese clade and two are closer to the European clade,” said Maria. She declined to confirm this when asked by Tempo. “Later, when there is additional information,” she said via WhatsApp on May 18.
ITD has sent two whole-genome sequences of the virus to GISAID: EJ-ITD3590NT/2020 and EJ-ITD853Sp/2020. Four others, according to Maria, are in the finishing stage in the laboratory. “Hopefully all will go well,” said Maria when asked whether her institution is planning to submit more virus data to GISAID. In total, nine whole-genome sequences of the SARS-CoV-2 virus from Indonesia have been submitted to GISAID. As of Thursday, May 28, GISAID’s database contained 34,000 virus genome sequences from 81 countries.
Amin said that there are three benefits to having complete genome sequences of the SARS-CoV-2 virus in Indonesia. “First is that we get to know which virus from other countries it is most closely related to. Second, we can trace the pattern of its movement from one city to others,” said Amin. “Third, the sequences may be used to design diagnostic tools or vaccines sensitive to the virus circulating in Indonesia,” said Amin, who is a professor of clinical microbiology at the University of Indonesia.
To find out which type the SARS-CoV-2 virus in Indonesia falls into, and which virus from other countries it is most closely related to, Sidrotun Naim, a virologist from Surya University in Tangerang, Banten, suggested consulting the Nextstrain.org website. “Referring to Nextstrain categorization, there are only type A and B. All the virus data in Indonesia fits into type or clade A,” said Sidrotun via WhatsApp on May 28. “Assuming it is not further edited. The data from ITD was subjected to numerous edits a week after they were submitted.”
Sidrotun said that, judging from the unrooted phylogenetic tree generated by Nextstrain, type A is currently the dominant clade globally. “Type B is only represented by those in the 1 o’clock direction. Type B is the early version in Wuhan, which was contained after the lockdown,” she said. “The ones which managed to exit the country prior to the lockdown then mutated and adapted into type A,” said Sidrotun, who holds a Ph.D. in environmental microbiology from the University of Arizona in the United States.
A cell infected by SARS-CoV-2 will release millions of new viruses. Each of these viruses carries a copy of the original genome. When the cell replicates this genome, an error sometimes occurs, usually affecting a single base. Such an error is known as a mutation. As it is transmitted from one person to another, the coronavirus randomly accumulates more and more mutations.
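To make the idea of accumulating single-base mutations concrete, the sketch below compares aligned sequences of equal length and lists the positions where they differ. The sequences are invented and `point_differences` is a hypothetical helper. Real genome comparisons first require a proper alignment, which this sketch assumes has already been done.

```python
def point_differences(reference: str, sample: str) -> list:
    """Return (1-based position, reference base, sample base) per mismatch.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, sample_base)
        for pos, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


# Made-up RNA fragments, for illustration only (not real viral data).
reference  = "AUGGCUAGCUUA"
sample_one = "AUGGCUAGUUUA"  # one base changed relative to the reference
sample_two = "AUGACUAGUUUA"  # keeps that change and carries one more

print(point_differences(reference, sample_one))
print(point_differences(reference, sample_two))
```

In this toy example the second sample shares the first sample's change and adds another, which is the kind of shared, accumulating pattern that lets related viruses be grouped on a phylogenetic tree.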
SARS-CoV-2 Virus from Indonesia/TEMPO
Nextstrain is an open-source project founded by Trevor Bedford and his colleagues to harness pathogen genome data. Its goal is to aid epidemiological understanding and improve outbreak response. Nextstrain provides a continually updated view of publicly available data alongside analytic and visualization tools for use by the community. GISAID, by contrast, provides access only to its members. Nextstrain uses part of the data from GISAID, but its visualization can only handle around 3,000 genomes.
Nextstrain shows that six of the seven virus sequences from patients in Jakarta submitted by the Eijkman Institute, along with one sequence sent by ITD, EJ-ITD853Sp, all fall in the A7 clade. The other sequence from ITD, EJ-ITD3590NT, belongs to the A2a clade. “The eight strains being grouped together indicate a dominant strain, at least in Jakarta. Or, they could also have originated from patients from the same cluster,” said Sidrotun.
Sidrotun also raised another possibility, in which seemingly unrelated cases, or cases occurring in different locations, are later found through contact tracing to be linked. “This means viruses circulating in the different places originated from the same common root,” she said. “For example, a person gets infected in Surabaya, then one of that person’s close contacts goes to Jakarta.”