TWiV774

Jim Duehr, 1h 44min

Speaker 1

00:00

This Week in Virology, the podcast about viruses, the kind that make you sick. From MicrobeTV, this is TWiV, This Week in Virology, Episode 774, recorded on June 29, 2021. I'm Vincent Racaniello, and you're listening to the podcast all about viruses. Joining me today from Ann Arbor, Michigan, Kathy Spindler.

Speaker 2

00:31

Hi, everybody. Here, it's 87 Fahrenheit, and they say that feels like 99 Fahrenheit. And it's supposed to storm in the next little bit, so you might lose me, I don't know. And that's the same as 34 Celsius, and feels like 37 Celsius. So pretty hot.

Speaker 1

00:50

Yeah, it's pretty hot here in New York City. 35 Celsius, and we have a heat advisory. Let's see what it is in the rest of the country. Coming back, from Tulane University, Bob Garry.

Speaker 3

01:03

Vincent, good to be here again. It's 29 degrees C here in New Orleans, and there's a little bit of precipitation coming down. Not too bad.

Speaker 1

01:12

Not bad. Alright. And joining us for the first time, from Scripps Research, Kristian Andersen. Welcome to TWiV.

Speaker 4

01:21

Thank you, Vincent. Hi, and thanks for having me on. Yeah, I'm in San Diego. 20 degrees C, 70 F. But it's actually a little cloudy right now, which I'm pretty excited about. I suspect it'll be sunny within the hour.

Speaker 2

01:34

So you're in the season of night and morning low clouds, right? That's right. It's the June Gloom.

Speaker 1

01:43

All right, we have a mission today, and that's why Bob and Kristian are here. We're going to talk about a manuscript that made a lot of noise, I don't know, last week, I think, or the week before, when it was released. It's by Jesse Bloom, and it's called "Recovery of deleted deep sequencing data sheds more light on the early Wuhan SARS-CoV-2 epidemic," a preprint at bioRxiv. And Bob and Kristian have agreed to come on and help us unpack this. But before we do that, Kristian, it's your first time on TWiV, so give us a little bit of your history: where you're from, where you got trained, and so forth.

Speaker 4

02:26

Sure, yeah. So I'm from Denmark, and I did my undergraduate degree in Denmark, in molecular biology in Aarhus, so not Copenhagen, the second biggest city in Denmark. Then I moved on to do my PhD in immunology at Cambridge University and the MRC Laboratory of Molecular Biology, doing a lot of mouse work and understanding immunity with a focus on T cells. I sort of got a little jaded with that and wanted to do something a little more exciting. So I hopped over to the US, to Boston, and did my postdoc at Harvard with Pardis Sabeti, which is also where I met Bob Garry, of course. We were working on infectious diseases with a strong focus on evolution, genomic epidemiology, and understanding transmission, spread, and evolution of these viruses. And now I'm a professor here at Scripps Research, and have been for about six years now, I think. Our focus has been on viruses as well. And again, a lot of that work is together with Bob.

Speaker 1

03:28

Yeah, as I told you yesterday, I visited Aarhus, and man, they taught me how to pronounce it. It's hard. I was getting it all wrong. That was in, I think, October 2019. And I guess you knew some of the people that I had on. I did a TWiV there, and it was really fun.

Speaker 4

03:45

Yeah, it's a great place. And I still keep the connection back to Aarhus, with some good colleagues and friends who are there, for sure.

Speaker 1

03:53

So I remember I flew in. I flew to Copenhagen, and then I took a train. My host was a graduate student, and she had heard I was going to Copenhagen for a meeting, I think the European Society of Clinical Virology. And so she said, come to Aarhus for a seminar. So I did. But on the plane, I was reading in the magazine about this picturesque street in Aarhus that had all these pretty houses. So I took a picture of it on my phone and I showed it to her when I arrived. I said, I want to see this. So the next morning she took me there. It's this tiny street of little houses. You must know it, right? It's beautiful.

Speaker 4

04:33

Yeah, there's a lot of character in parts of that city, for sure. It's great. I mean, it's near the forests and the ocean. So yeah, forests and trees are something I'm missing here in San Diego, for sure. We need models.

Speaker 1

04:47

So I went to dinner with some students, and they said this is the best place in the world to live. Including, I think, one of them who was from Israel. He said this is the best place in the world. And my host said, except it gets dark pretty early in the winter. You have to deal with that, right?

Speaker 4

05:02

Well, if you like sunshine, I'll say San Diego is probably better than Mike. But yeah, it's a good place. And science is really a big thing in Aarhus. I'll say, I go home every once in a while these days, and the place has exploded. It's much bigger now than when I left, which is many years ago by now. It's almost 15 years ago.

Unknown Speaker

05:25

I think there's a Nobel laureate out of Aarhus, right?

Speaker 4

05:28

Yeah, there are a few. Because you ask me now, I'm blanking, and I shouldn't, but yes, there has been a long history of, especially, membrane proteins, actually, as one of the big things. First the physiological aspect, which is what the Nobel Prize was for, and then more recently around structures and understanding their function, in both neurology and, more recently, in infectious diseases, of course.

Speaker 1

05:55

Now, I noticed when COVID hit that Aarhus was mentioned often as helping in screening or sequencing, whatever, which was quite interesting, right?

Speaker 4

06:06

Yeah. I mean, well, Denmark in general. I'll say, in terms of rapid screening, Denmark was actually screening almost 10% of its population daily, which I think is one of the main reasons why they were quite successful at keeping numbers low, despite being pretty much fully open. And also, importantly, there's their genomic surveillance program. They're sequencing most cases these days, which really gives us a very clear look at evolution, variants, and all these things. It's very, very impressive. And it's a small operation. It's a single university, a single lab, doing most of the genomic sequencing for the whole of Denmark, which is really impressive.

Speaker 1

06:49

Yeah, I met someone at another meeting, actually, from the Statens Serum Institut, right? And I guess it's in Copenhagen, right? And he got involved with the mink outbreak. I can't remember his name. You probably know the guy. Anders Fomsgaard? Fomsgaard, yes, exactly. Yeah. And he gave me his book. He wrote a book called It's Just a Virus, and it's in Danish, of course, but it's got a funny picture of him on the cover in some kind of, you know, high containment suit, but he has regular shoes.

Speaker 4

07:27

Yeah, there's an article with him that I saw just a couple of days ago, where he's sitting in his full suit, but then on the phone in his office, which doesn't really make sense. I guess it was a good photo. Yeah.

Speaker 1

07:41

Well, I tell you, when the mink outbreak happened, I emailed him, and he immediately answered and he gave data. He said, we're going to publish daily updates. He was very forthcoming. It was really good.

Speaker 4

07:52

Yeah, yeah. And I'll say, actually, for understanding the emergence and origins of SARS-CoV-2, I would point to the whole mink stories that we have seen both in Denmark and in the Netherlands, where clearly this virus is a generalist that's capable of jumping right into a mink population, burning through that population, and going right back out to people again, with very little evolution at all. There's a single mutation that we see pop up frequently, but it's fully capable of doing that. I mean, there are other viruses that can do that, but I will say SARS-CoV-2 certainly has a unique ability to do that, and with a lot of different animals, mink included.

Speaker 1

08:33

What's the story with the mink in Denmark at the moment? Did they quell the outbreaks? Are they still having the farms, or what?

Speaker 4

08:40

No, they culled the entire population. Wow. I think there might be some small-scale operations. But I mean, yeah, it was chaotic, right? And it wasn't done optimally, no question about it. But having this sort of parallel epidemic going on in animals is of course a risk, both in terms of viral evolution and in terms of spillback into humans from the mink population. And I will say, one of the things that I was certainly concerned about too is going from mink to other farmed animals. Mink are often farmed together with other farmed animals, and you can imagine SARS-CoV-2 getting established not just in the mink population, but also in other farm animals and maybe even rodents and other animals, which of course would be a big problem in terms of keeping numbers low, eradication, and vaccine strategies. Are there wild mink in Denmark? I believe there are some wild mink, and there are also escaped mink. But of course Denmark was the largest producer of mink prior to the pandemic. And there are many mink farms here in the US, for example, too. And in China as well? And in China, yes, absolutely.

Speaker 1

10:02

I'm just curious to see how many wild animals end up being infected. I assume people are doing surveillance. I know Tony Schountz in Colorado was interested in the deer mouse population in the wild, right? But I wonder what else, right?

Speaker 4

10:19

Yeah, I mean, there are papers showing that the variants of concern, probably because of that N501Y mutation and possibly others, are capable of infecting rodents, right, which just sort of vanilla SARS-CoV-2 is not capable of. So the host range is clearly broad. The question of whether it's capable of establishing itself in some of these animal populations is certainly concerning, as is whether it could also just spill back into bats. Is that even possible? I'm sure Tony would have some good insights on this. It's probably not super high risk, but certainly, as the pandemic continues to be ongoing, it is one of the things that we need to be aware of. Yeah.

Speaker 1

11:07

I'm sure we'll find out in the coming years. Absolutely. All right. So before SARS-CoV-2, Kristian, what were you working on mainly with viruses?

Speaker 4

11:20

So, I mean, most of our work has focused on Lassa fever and Lassa virus, and that's a lot of the work with Bob. Of course, Bob has worked with our collaborators and colleagues in Sierra Leone for many years, and also colleagues in Nigeria and other places in West Africa. So Lassa has been and continues to be a strong focus of both of us and our collaborative projects, which are huge. Ebola, of course: we were right in the middle of it when that started, or was detected, in West Africa in March of 2014. So there was a strong focus on Ebola back then, and this continues to be a focus today. Then Zika, as that hit the Americas, where our focus was mostly on how it got to the US, how it was spreading in the US, and how it emerged. And then more recently, West Nile has also been a focus of ours, which is an endemic virus here in the United States. We're working with a lot of public health partners and vector control labs all across the country, and also with academic institutions, to understand not these outbreak viruses, because much of our work has been on outbreak viruses, but to get better assessments of evolution, spread, transmission, and what the drivers are for an endemic virus. That's a recent endemic virus, right? As you well know, it came into New York City, or was first detected there, in 1999, and has since spread all across the country.

Speaker 1

12:53

So that's, we have to get you back when Dickson Despommier is here and talk about West Nile. That would be fun, because he wrote a little book about West Nile called West Nile Story, I think it was. And he loves to tell stories. Actually, the Ebola leads me to ask you: there is a current outbreak in which I think the virus was very similar to the one of five years ago, and seems to have persisted in someone. What do we know about that?

Speaker 4

13:24

Yeah. So that's the recent reemergence in Guinea, and Bob knows this well too, which is a continuation of the epidemic in West Africa, for which the last cases were described in 2016. What they managed to do, and this is work done in West Africa, was sequence the virus. And what they realized is that it had a very slow molecular clock. It connected right back to that Makona variant which caused the epidemic in West Africa. And, really importantly, it had this one single mutation, the A82V mutation, which is something that developed during the West African epidemic. The new virus here has that signature mutation, and on a phylogenetic tree it clearly connects back to that epidemic. So that is almost certainly a persistent infection, which of course we've seen in the DRC more frequently these days, too.

Speaker 3

14:34

Yeah, there was another outbreak this year in the DRC, right? And it's the same thing. They have very strong epidemiology on that one, because, you know, the first case's husband was an Ebola survivor, and they could clearly link the two viruses there and show that, you know, this slow clock is happening. People are sort of like a reservoir, right?

Speaker 4

15:01

Yeah, I mean, it's interesting, because to me it's humbling. If you'd asked me in 2014, do you think Ebola can stick around in a person for five years and then cause a new outbreak, I would have said absolutely not. But it's clearly happening. And we saw several versions of this during both the West African epidemic and later in the DRC. They weren't quite five years, but they were long periods of time. And what's interesting here is that during this period of time the virus really doesn't evolve very much. Which is interesting, because Ebola has a very clock-like structure. We can see its evolution during an epidemic, and it's very predictable. Even between epidemics, it's highly predictable. But when it finds itself in these latent infections, or whatever it is, in survivors, it sticks around and it doesn't really change much. It doesn't evolve at all. And that's a clear signature, which is interesting, because it's the opposite of what we see with SARS-CoV-2, where we think that there might be these persistent infections that actually have accelerated evolution. I would say the emergence of the Alpha variant, B.1.1.7, first detected in the UK, is an example of that, where the molecular clock wasn't slower, it was much faster. So there are clearly differences there, which we don't fully understand. But this is important and interesting, for sure.

Speaker 2

16:28

In the case of Ebola, do you think it's something where the virus is very slowly replicating, and/or truly latent? Or, I mean, is there any evidence that you can get?

Speaker 3

16:39

There's definitely clinical latency, right, because the patients are not showing virus, at least in the periphery. But, you know, are there sanctuaries where the virus could be replicating at a lower level, say, in men in the testes, for example, or in the CNS? It's possible it could, you know, maybe come back and forth. One of the important things, though, is that it does seem to be linked to immunity. So it takes a little time for the virus to reemerge from a person who is clinically latent. And so there's a thought that perhaps vaccines could be used to maybe control that in survivors. I think it's important for the audience to know that, you know, these Ebola survivors in Africa, we need to be paying a lot more attention to those people. They have ongoing medical problems. This just reminds us that those survivors are important, particularly if they can be a source of a new outbreak. So we need to really be working with the countries there that have large numbers of survivors, and maybe give them some vaccines that might be able to suppress the virus and keep it from being passed on to another person. But those are questions that, you know, require a lot of research. They're definitely important things to be doing. Great.

Speaker 1

18:00

All right. See, folks, virology is cool. There are many other things we can talk about. Alright, so back to this preprint, posted on June 22. Well, I guess that's the one I have anyway. I guess that's when I first saw it, June 22nd, and it's probably been updated. As I said, it's "Recovery of deleted deep sequencing data sheds more light on the early Wuhan SARS-CoV-2 epidemic." So this is about some sequences that were supposedly deleted from what's called the Sequence Read Archive. So maybe we can start with explaining what that is. I don't know who would be better. Bob or Kristian, can you do that? Yeah, for sure.

Speaker 4

18:40

Yeah, I mean, the Sequence Read Archive is under the NCBI, which is a US resource, US controlled, and it gives you the ability to submit raw sequence data. So sequences come off a sequencer, and then you take that data, that's the raw data, and you generate genomes based on that, or in this particular case amplicons, because it was actually for the purpose of developing a new diagnostic that was using the nanopore platform. But the raw data can then be submitted to what's called the SRA, an archive, which is something that, for SARS-CoV-2 for example, many labs do, but many labs don't. I will say probably most labs don't actually submit to the SRA, because it's the raw data. It's messy, it's big, it's difficult to submit, it's difficult to manage. Most people focus on submitting the actual genomes, you know, mostly full genomes, and they do that to GISAID, which is a different entity, and sometimes to NCBI GenBank too. So the SRA is just a place to put the raw sequence data.

Speaker 3

19:53

I would say there are almost 2 million sequences in GISAID, and the majority of them do not have the raw data, the sequence reads, posted at NIH or in any other database like that. It's actually kind of a rare thing to do. I mean, it's not a bad thing to do. It should probably be encouraged, and it helps. But, you know, it's not the usual thing to do.

Speaker 4

20:21

Yeah, I'll say, you know, we definitely all should submit the raw data. And I will say, in my own group, we do do that, although we are much delayed in pushing it to the SRA. We do it on Google Cloud, for example, and make it available in real time there. But to Bob's point, probably 90 to 95% of the genomes that are out there don't have the raw data on the SRA. I totally agree with Bloom on this, that it would be advisable to have all the raw data available on things like the SRA. And I think it's important, so you can test and check how good the genomes are, what confidence you have in specific genomes, or concerns about contaminants and sequencing errors, for example. Absolutely. But it needs to be clear that for most cases, it's not really something that's being posted for most projects.

Speaker 1

21:15

So you would, this would be all the sequence. So any contaminants in the sample would also be uploaded as well. Right?

Speaker 4

21:24

Possibly, yeah. I mean, it depends on exactly what the source is. For many people, for example, you assemble your genome, so you align reads to your genome, and then you submit all the reads that are aligned, which is then only going to be SARS-CoV-2 reads. Sure, sure. But in other cases it's metagenomic sequencing, where, sure, you sequenced the virus, but you also sequenced whatever else is in there, which is then submitted. So it really varies, depending on what specific

Speaker 3

21:57

It's a highly specialized database. I mean, unless you're doing genomics and sequencing and actually looking at these questions about things like the SNPs and how deep the reads are and things like that, the vast majority of even virologists, or people even studying the viruses, would never, you know, go to the SRA to look at that original data, unless there was just a question that came up about it. Yeah.

Speaker 1

22:22

Okay. Got it. Alright, so what sequences is Jesse saying were removed? Who put them there, and what was the purpose, and so forth?

Speaker 4

22:34

Yeah, yeah. So I mean, those are the authors of the Wang et al. paper, who had a preprint back in March of 2020. The paper is about developing a nanopore diagnostic. They amplify up some amplicons, some part of the SARS-CoV-2 genome. And then based on that they can see, well, if you get an amplicon, that's a positive sample, but you can also get some information about specific mutations that might be in there in the genome. And those genomes are woefully incomplete. It's only, you know, I don't know, about one kb or so, when the full genome is 30 kb, right? So it's a really small part of the genome, but it includes part of the spike gene, for example, in which we see a lot of SNPs. So that's what the paper was. The paper is about the diagnostic. They have a table one, where they actually show all the SNPs that they find in the data. That's both in the preprint and in the later paper, which was published in a journal called Small, which I believe is mostly a chemistry journal. It's not a no-name journal, as some people have assumed. It has an impact factor above 10. So it's actually a pretty high impact journal, and specifically for these kinds of projects, right, where you're trying to develop a new diagnostic.

Speaker 3

23:55

Yeah, I actually looked that up, Kristian. And, you know, there's been an insinuation about publishing in this journal, Small. It's nanopore sequencing, you know, so that's a good place to publish something about a diagnostic using nanopore sequencing, right? It actually has an impact factor of 11.5. Now, if you look at the top three general virology journals, you know, Journal of Virology, Journal of General Virology, and Virology, the impact factors of those journals do not add up to 11.5. Okay, so if the insinuation was that the authors were hiding this data somewhere, I mean, the place to hide it is not in a journal that has an impact factor that's, you know, higher than PNAS and a few other very important journals. Small is pretty well read. They're diagnosticians. They're developing diagnostic assays. This is a perfectly reasonable place for them to put that data, and they weren't trying to hide anything.

Speaker 1

25:00

So these authors uploaded the data for their paper to the SRA, correct? Which is something they decided to do. And then, according to Jesse, some of it was taken down, is that right? Yeah.

Speaker 4

25:16

So then they later deleted that. And because Bloom just updated the preprint, we actually have the email from the authors to the NIH, which they have shared with Jesse. So we know what reason they gave. And the reason is, well, I'm just reading from that email now, it said: "I found that it's hard to visit my submitted SRA data, and it would also be very difficult for me to update the data," which is all correct. "I have submitted an updated version of this SRA data to another website, so I want to withdraw the old one at NCBI in order to avoid the data version issue." And this is in June, so I believe around the time when they published the final paper in Small. And I will say, you know, deleting data is obviously not optimal. That's not what we should do, right? We should put data online, we should make it open access, we should have it all there. But this is not unusual. This happens often. And the data version issue is a real one, because the SRA is difficult to access, and it is difficult to update your data on there. So if they wanted to put it on another website, again, not ideal, right? It should be available. But the data version issue is a perfectly legit reason for why you might request that it get deleted by NCBI. When I read this, I was like, yeah, I mean, that makes sense.

Speaker 2

26:54

I was gonna ask if there would be a way to cross-reference it. Just in general, going forward or whatever, leave it on the SRA, but then have something that says, you know, for a more manageable or different version or something, see XYZ location.

Speaker 4

27:15

Now, you would probably delete it under those and then maybe create a new BioProject if you wanted it on the SRA. Of course, again, these are Chinese authors, I should say. They say they wanted to put it on another website. Maybe they have done that, but we don't know what that website is. It might not be a public website, you know, things like that. We don't know. And again, it's not optimal, right? But it's not suggestive to me that they're trying to hide anything. Because again, this is commonplace. This is something that happens often, not just for SARS-CoV-2, but for sequencing projects in general.

Speaker 3

27:51

And I think the other important point is that in table one of that Wang paper, all the important data is there. All the SNPs are there. You can totally reconstruct everything about those genomes with the data that's in table one. And in point of fact, you could do all the analysis that Jesse Bloom did in his preprint using that data from table one. So, you know, I mean, is that what they were trying to do?

Speaker 4

28:21

I mean, table one in the Wang et al., in both the preprint and the paper, is basically table one in Jesse Bloom's own preprint, right, which is reflecting the same SNP data. I agree that it's not optimal, right? Trying to pull the data from a table to reconstruct genomes is not optimal. It's better to have the genomes. But neither is it optimal to go back to the raw data and try to do it from there, right? None of these things are optimal. And yeah, Small is not the optimal journal for this. But again, you know, science is rarely perfect, and there's nothing unusual about this. We can all agree that, yeah, there would be better ways of doing that. But just because people didn't do it in the way which we think would be optimal, that doesn't necessarily mean that they have something to hide and are trying to obscure anything, which I think is a problem in this particular case.

Speaker 3

29:19

And I mean, I think for these authors the journal Small is a perfectly reasonable place to publish it. It's got a really great impact factor, and they're probably reaching the audience that they really want to target with that journal. So no, nothing surreptitious there.

Speaker 2

29:36

I just want to point out for our listeners that "snips" is shorthand for SNPs, single nucleotide polymorphisms. So if the SNPs are listed in a table, then all of the sequence around those SNPs is wild type or consensus or something like that. And so that's how you would be able to construct what the rest of the sequence is. You know what the single nucleotide polymorphisms, or changes, are, and from that you can generate the rest of the sequence.
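[Editor's sketch, not part of the recording.] The reconstruction Kathy describes can be illustrated roughly as follows. This is a minimal sketch under stated assumptions: the reference string and the SNP positions below are made up for illustration and are not taken from the Wang et al. table, and `apply_snps` is a hypothetical helper name. It also assumes the table lists every difference from the reference in the covered region.

```python
def apply_snps(reference: str, snps: list[tuple[int, str, str]]) -> str:
    """Return a copy of `reference` with each SNP substituted in.

    Each SNP is (1-based position, expected reference base, alternate base).
    Every position not listed is assumed to match the reference.
    """
    seq = list(reference)
    for pos, ref_base, alt_base in snps:
        # Guard against a table that was built on a different reference.
        if seq[pos - 1] != ref_base:
            raise ValueError(
                f"position {pos}: expected {ref_base}, found {seq[pos - 1]}"
            )
        seq[pos - 1] = alt_base
    return "".join(seq)


# Hypothetical toy data, for illustration only.
reference = "ATGCCGTTAGCA"
snps = [(3, "G", "A"), (9, "A", "T")]
reconstructed = apply_snps(reference, snps)
print(reconstructed)
```

The reference-base check is a design choice: it makes the sketch fail loudly if the SNP coordinates and the reference do not agree, rather than silently producing a wrong sequence.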

Speaker 3

30:06

So, another thing Kristian just mentioned, and I haven't gotten a chance to look at the revisions that Dr. Bloom made in this preprint yet, but I did notice, just at first look, that one thing he did do was delete figure two of his first version, which was an email that was obtained by a Freedom of Information Act request from, I think, an animal rights group. I'm not sure exactly what their role in the origins is. But they had obtained an email about another group who had submitted some sequences from pangolin coronaviruses. And they had asked for part of that data to be removed from the archives. I think that, Kristian, you know this better than me, but it's still up there on the NCBI sequence

Speaker 4

31:03

That data actually returned. They were concerned about contaminants in it, which happens often, I should say. In these big sequencing projects, where you're sequencing a lot of viruses, contaminants are a common issue you have to deal with. You might not immediately realize that it's contaminated. I think a good example of this is, you know, we used to sequence a lot of Lassa and Ebola when I was at the Broad, and they still do. And those reads ended up in some of the G tech data sets, which are completely unrelated, right? And all of a sudden, you realize that these people from Boston have Lassa, only they don't. They're contaminants, right? So in these cases, they're saying, look, we need to remove that so it doesn't confuse people, and we'll re-upload the data or do whatever. That, I believe, is what happened with these pangolin coronavirus papers, for example. But of course, it's totally unrelated to this particular study here.

Speaker 3

32:01

Yeah, so I'm glad to see that Dr. Bloom deleted that figure two, because, I mean, I felt that was one of the things that was really totally inappropriate in that preprint, in some ways. So first of all, the authors were Chinese. So you're using an example of a removal of a sequence arguably related very closely to the origins of SARS-CoV-2, and the only reason that example existed was because of this Freedom of Information Act request, clearly political in nature. And, you know, to include that in what is substantively a scientific paper, I think those are the types of things that scientists need to avoid. I mean, science and politics don't really mix, and we want to try to look at the data that's available. But to put something in that just points to, you know, the Chinese covering up and doing all these things... The Chinese government has not been totally open about the pandemic. There are a lot of things out there that we would like to know and that haven't been described. But to really hang it on some scientists that didn't have anything to do with this paper, where he's making a case that they, you know, pulled the data and that kind of thing, I thought that was one thing that was totally inappropriate.

Speaker 1

33:32

Was it you, Bob, who said on Twitter, if you mix science and politics, you get politics?

Speaker 3

33:37

That may have been Peter Daszak, but I think that's one of the wise things, even if I didn't say it. Yeah, you get politics. And, you know, nobody is naive enough to think that politics doesn't play a role in science. Obviously, we have to get our funding through the political process in every country. But, you know, maybe I'm Pollyannaish about this, but I think that we need to have sort of a dividing line, a firewall, between science and politics. And scientists need to be able to collaborate in spite of the fact that there are political differences between the countries. There are some really excellent scientists in China. I know that because I've trained a few of them. And we've been doing that for decades, building that scientific enterprise out there. They spend a lot of money, and they're really making excellent things. And, you know, if scientists can't get along without accusing each other of doing surreptitious things and hiding data and all this kind of thing, I think we're in deep trouble. I hate to see the field going in that direction.

Speaker 1

34:54

Yeah, that's something we could certainly dwell on for a long time. But back to the data. So how did Jesse get these sequences?

Speaker 4

35:04

Yeah. So it turns out that with the deletion, NCBI didn't actually delete the data. Where the data is stored at the NCBI has recently been set up using the Google Cloud and the Amazon cloud. So they recently transitioned to that. And I think what was supposed to happen was that as a deletion request comes in and they execute that, then all the data is removed. In this particular case, it turned out that actually the data that lived back on the cloud, specifically on the Google Cloud here, wasn't actually deleted. And Jesse was, because they have a standard file structure, able to recapture the data from that structure.

Speaker 1

35:45

Yeah. Okay. All right. So now we have these, how many sequences are we talking about? Roughly? Does anyone have

Speaker 4

35:51

I think it was a total of like, maybe 45 or so. Most of them are incomplete. Again, this is nanopore sequencing; it's not for the purpose of doing phylogenetics. It's for diagnostic purposes. The final set that Bloom then is looking at, I think it's maybe 10, or 13, maybe. Okay, yeah.

Speaker 1

36:11

And these cover spike through the next ORF, right, is that correct?

Speaker 4

36:16

Yeah, I think they cover spike and then maybe into ORF10? Or some, some reach to, yeah.

Speaker 1

36:23

Alright, great. So from the title, he, you know, he says that they shed light. So what kind of light does his analysis shed?

Speaker 4

36:31

I mean, I think, you know, what needs to be clear here is that getting more data, early full-length genomes from early cases in Wuhan, is critically important. And it's something that, I mean, we mentioned as much in our proximal origin paper of last year. We mentioned that some of the most valuable data we could get is genomic information on some of the very early cases. And they haven't been forthcoming, not because of individual scientists, but I'm sure, you know, China, again, is obfuscating a lot, including some of the early events in Wuhan, which is a problem. Animal markets, right? Yeah, well, yes, very specifically, obfuscating the animal markets. But I think also just, you know, the samples may no longer exist, the samples might exist but be degraded, they might not be able to access them for IRB issues. Like, there's regulatory issues for why you can't just go back and say, like, look, I really want to go back and get all these samples so I can sequence them. We can all agree that that would be incredibly beneficial. And I would say, more the really early sequences, so we're talking like December in particular, right? The sequences recovered here seem to be from January and February. So yeah, they are early, but they're not some of the earliest. And again, any new data here is going to be valuable. And this particular data adds a little bit to that. But I'll say it doesn't really change anything. It doesn't add anything we didn't already know. But more data is always better.

Speaker 3

38:13

So there's actually a lot of misinformation about that. I mean, even in some of our mainstream media publications, you know, they're saying this data sheds light on the earliest cases of SARS-CoV-2 in Wuhan, and, you know, cases before any other case that we knew about, actually, is what the implication was from some of the articles in the Washington Post. And, you know, some popular Twitter users that like to talk about the origins have really put out a lot of misinformation about this preprint, saying, look, this proves that the virus didn't originate in the Huanan market. Now, you know, Dr. Bloom and his paper have contributed to some of that in a large way. But the reporting has been really, I think, substandard there. I mean, the sequences from January or even February, you know, don't tell us that it leaked from the lab or anything like that, even though that was the clear insinuation from a lot of the reporting. And unfortunately, I think, you know, that's been sort of shepherded along by people who are advocates of this lab leak hypothesis.

Speaker 4

39:31

Yeah, and I'll say maybe just to put a little specifics on it right there. What we knew based on previous data is that we had two early lineages, which we see in China, including in Wuhan. One is lineage A and one is lineage B. And the new data here adds to, like, lineage A and lineage B, so they add to that existing knowledge, right, saying that we know that these things are circulating. I think what Bloom does is that he, I think he misunderstood a little some of the earlier data, for example, the way that the data is labeled, which is something that myself and many others who've done phylogenetics already corrected for, so that we already see evidence of lineage A and B in Wuhan, most of them lineage B. And lineage B is what we see, what we detect at the Huanan market; what was sequenced there is all lineage B. That's not to say that lineage A wasn't at the Huanan market, because there was probably super spreading at the Huanan market, right, but we just never captured lineage A because we came in too late. We didn't sequence enough, we didn't sample enough from that market. What we know from the WHO report, and in fact we have known this since last year, too, where it was reported in some preprints, is that lineage A is associated with a different market in Wuhan. What that market is exactly we don't know. It's not mentioned in the WHO report, which is unfortunate. I've asked questions about that: can we please know what this market is? And it's been a blank wall so far, so we don't know. But we already knew that these two lineages were circulating in Wuhan specifically, with the one dominating at the Huanan market, right, and then again, the other lineage associated with a different one.
