'How did the coronavirus evolve' is a simple question that has prompted investigation, speculation, and outright conspiracy theories.
Right now, research shows that it's highly likely the virus responsible for COVID-19 evolved naturally, probably starting in bats and then circulating unnoticed in an animal host until it acquired the mutations it needed to cause the global pandemic we see today.
A new study lends even more weight to this theory. It has found a close relative of the SARS-CoV-2 virus in bats, complete with similar insertion events (mutations in which extra genetic material is inserted into the viral genome), showing that such changes to the make-up of the virus can and do happen naturally.
"Since the discovery of SARS-CoV-2 there have been a number of unfounded suggestions that the virus has a laboratory origin," says Shandong First Medical University microbiologist Weifeng Shi.
"In particular, it has been proposed the S1/S2 insertion is highly unusual and perhaps indicative of laboratory manipulation. Our paper shows very clearly that these events occur naturally in wildlife. This provides strong evidence against SARS-CoV-2 being a laboratory escape."
This newly discovered bat coronavirus, which the team has called RmYN02, was identified during an analysis of 302 samples from 227 bats collected in Yunnan province, China, back in the second half of 2019.
After analysing the viruses found in these bat samples, the team were able to uncover two almost complete coronavirus genomes - RmYN01 and RmYN02.
RmYN01 only had a low match to SARS-CoV-2. But RmYN02 was something of a jackpot. This coronavirus shares 93.3 percent of its genome with SARS-CoV-2, and in one particular gene, called 1ab, the match rises to 97.2 percent, the closest match in that gene to date.
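For readers wondering what a figure like "93.3 percent" means in practice, here is a minimal Python sketch of one common way to score two sequences that have already been lined up against each other. The function name `percent_identity`, the choice to skip gapped positions, and the two short sequences are all assumptions made for illustration. They are not real viral data, and this is not necessarily the method the researchers used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of matching positions between two pre-aligned sequences.

    Assumption: positions where either sequence has a gap ('-') are
    skipped. Other definitions of identity treat gaps differently.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


# Made-up toy sequences, not taken from any real genome.
toy_a = "ATGCCGTA-ACGT"
toy_b = "ATGACGTATACGT"
print(f"{percent_identity(toy_a, toy_b):.1f} percent identical")
```

Real comparisons run over genomes of roughly 30,000 letters and depend on how the alignment itself was built, so a sketch like this only conveys the basic idea.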
Then there are the insertion events. RmYN02 contains amino acid insertions at the point where the two subunits (S1 and S2) of its spike protein meet. SARS-CoV-2 also has an insertion at this S1/S2 junction. The inserted amino acids are not the same in the two viruses, but the finding shows that these insertions can occur naturally, no lab required.
These similarities don't mean RmYN02 is a direct ancestor of the virus causing COVID-19 around the world, especially considering that the gene region for the all-important receptor binding domain had a very low match to SARS-CoV-2, at only 61.3 percent.
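To give a rough sense of how an insertion shows up when two protein sequences are compared, here is a minimal Python sketch. It assumes the two sequences have already been aligned, with `-` marking gaps. The function name `find_insertions` and the short sequences are hypothetical, made up for illustration, and do not come from either virus.

```python
def find_insertions(aligned_ref: str, aligned_query: str) -> list[tuple[int, str]]:
    """List runs of residues present in the query but absent in the reference.

    Each result is (ref_position, inserted_residues), where ref_position
    is the number of reference residues that come before the insertion.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")

    insertions = []
    ref_pos = 0   # reference residues consumed so far
    current = []  # residues of the insertion being collected

    for r, q in zip(aligned_ref, aligned_query):
        if r == "-" and q != "-":
            current.append(q)  # query has a residue where the reference has a gap
        else:
            if current:
                insertions.append((ref_pos, "".join(current)))
                current = []
            if r != "-":
                ref_pos += 1

    if current:  # insertion that runs to the end of the alignment
        insertions.append((ref_pos, "".join(current)))
    return insertions


# Made-up toy alignment: the query carries three extra residues.
print(find_insertions("MKT---LLV", "MKTGGSLLV"))
```

In a comparison like this, two sequences can each carry an insertion at the same spot while the inserted residues themselves differ.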
But finding new coronavirus genomes is incredibly helpful if we want to discover how the SARS-CoV-2 virus evolved into what it is today.
"Our study reaffirms that bats, particularly those of the genus Rhinolophus [horseshoe bats], are important natural reservoirs for coronaviruses and currently harbor the closest relatives of SARS-CoV-2, although this picture may change with increased wildlife sampling," the team writes in their study.
"In this context it is striking that the RmYN02 virus identified here in Rhinolophus malayanus is the closest relative of SARS-CoV-2 in the long 1ab replicase gene, although the virus itself has a complex history of recombination."
The closest match to SARS-CoV-2 found so far is a bat coronavirus called RaTG13, which shares 96.1 percent of its RNA sequence, but it's likely there are even closer relatives out there.
"Neither RaTG13 nor RmYN02 is the direct ancestor of SARS-CoV-2, because there is still an evolutionary gap between these viruses," Shi explains.
"Our study strongly suggests that sampling of more wildlife species will reveal viruses that are even more closely related to SARS-CoV-2 and perhaps even its direct ancestors, which will tell us a great deal about how this virus emerged in humans."
The research has been published in Current Biology.