Secure Data Storage - WG
7:52PM Jun 11, 2020
that's pretty cool.
Where in the world are you?
Near Berlin, just in the suburbs, basically. Nice. Beautiful Potsdam.
Camping and standards, they go really well together.
But what they guys still do and I'm on vacation. This is the 10 standards.
That's the spirit
I guess it's not a holiday in the US
is not feast to the Pentecost. I think it would be if this were an orthodox country.
So welcome everyone. We're going to give another minute or two for people to file in.
So now I have bandwidth for zoom but not for the agenda.
Maybe from you.
Clear Do you? Are you co presenter? Can you also make me
or do we are waiting for blog? I'm not co presenter. I'll go look and see if we have the codes. All right rights codes. Yeah. Okay. Well, while you're doing that, oh Let's see if this works. Oh, yeah, no, I don't have the key.
Okay, so it's two minutes past. Let's get started. Welcome, everybody. This is the weekly Secure Data Storage Working Group call. The
agenda... and I wish Zoom displayed
chat since the start of the meeting when you join. I think
Jitsi does. Maybe it's a setting in Zoom or something. Anyway, here's the agenda, and the agenda link is in chat. Quick reminder that we use either the Zoom raise hand for queuing up, or you can type q+ in chat. Another quick reminder: this is an intellectual property rights (IPR) policy protected call, so in order to make substantial contributions, you must sign the IPR policy. That can be found at the bottom of the
Secure Data Storage wiki. So here it is.
So again, I encourage everyone to both join the group and, unrelatedly, sign the IPR policy.
Thank you, Dan. As Dan points out, that means substantive contributions, meaning those that have normative impact on implementations. So yeah, sign early and often, and let's get started. Our agenda today: we're going to do introductions. We're still early enough that reintroductions don't quite make sense. We'll go over a couple of layer naming proposals, and then hopefully we can do a deep dive into the base layer. We also want to give another shout-out to Christopher Webber and Serge, who are joining us from the land of Datashards, a similar and hopefully compatible spec
that we can talk about towards the end of the meeting. Stefan, I see you're on the queue.
Yes, sorry. I just saw that introductions are on the agenda. New people present?
Please, why don't you go ahead. Yeah, that's what it means. Okay, so
I'm Stefan. I represent Cryptonics. We are a Spanish cybersecurity company with a strong focus on decentralized systems and cryptography. So yeah, we just signed up for this group recently, signed the IPR form, and I just thought I'd present myself.
Welcome. We're glad to have you. Juan, have you done the introduction yet?
Um, I don't know that I have.
Let's do it. Yeah, sorry. My name is Juan. I work for Spherity, which is a German company in Dortmund. I'm currently in beautiful Potsdam.
We're an SSI company. We don't have immediate plans to use SDS for anything. I'm just sort of
curious to see how it plays out.
All right. Thank you for the introductions. Let's, let's get started. Are there any?
There is a third introduction.
My name is Clive canola. I'm working for Slock.it. This is a German blockchain company, and we are developing, based on the Newport framework, our own SSI identity solution.
Fantastic. Thank you and
from everything I hear, your tech and your client implementation are excellent. So glad to have you in the group. All right. Before we get started, are there any questions or announcements?
So let's just jump right in. So one of the
Wait, real quick, Kalia. I'm trying the host key and it does not seem to be valid.
Okay, where do you put it? You hit claim host,
which is right next to the raise hand button, and I believe it asks you to put in that code. Okay, I'll keep searching to see if we sent a different one in a different... Okay, sounds good.
All right. So
one of our primary challenges, especially in the beginning of this group, is to agree that secure data storage, both the spec and our conversations, has to be about layers. We need to come to consensus about that, and about what to name the layers. And then of course, we need to move through each one of them and actually write the specs, write implementations, and do all of those good things. So today, we wanted to approach it from both ends. We want to talk about two issues that are proposing actual concrete names for the layers. And then at the other end, we want to do a deep dive on the data layer, the lower layer of byte storage. Hopefully there's fairly wide consensus there, but if not, we'll discover it. So who wants to introduce the naming proposals? Would you like to?
I can, if there's an issue that... there is
an issue. It is issue 74.
Link it in chat, so that people can follow along.
Yes. So it's linked off of the agenda document, and here's the link to the issue specifically.
I'm not sure what Daniel posted, but... All right, issue 74.
Yes, this is my issue, and that's a link directly to the issue for those who want to get straight to it. So we've been talking about layers. And I think one of the things that's been confusing is that the spec already has this concept of layers. So potentially members of the working group are trying to read the spec and follow along with the layers conversation, and it's just confusing, because what's layer zero, and so on? It's unclear. So this proposal is, first of all, to give a layer ordering system that isn't in the spec. That's why it's the alphabet proposal. And to start at the lowest possible layer, and work our way up through layers until we cover what we believe is the desired functionality the working group is attempting to achieve. So are there any questions about the issue before I try and summarize the first layer?
So I have a quick procedural question.
The other layer naming issue, Kyle's: do we want to talk about it after this one, or do you want to introduce them both?
Yes. Where is the
issue? 44, right here.
So, issue 44 was raised by Kyle. Kyle, are you on the call? All right, I will summarize his issue. He was actually first to suggest that we needed better names for these layers. And there's a discussion on issue 44 that all of the layer names are kind of higher-level layering names. And I think while it's a great idea, it's not specific enough to avoid objection. So the reason I created the alphabet proposal is that I think it's even more fine-grained, such that we can agree on a first layer and move above it. That's why I prefer to discuss the alphabet proposal and not Kyle's proposal. But if we need to cover it, we can go ahead. Any objections? Okay. So I'm going to move forward. We're on issue seven... oh, wait one second. We have a
raised hand from Manu.
Oh, yeah. I just wanted to point out that the base layers in both of these proposals effectively look like the same thing, right? One of them is byte storage... I forget the other one, Kyle's, which is layer one. It's just basically storage, right? So it's
like secure storage.
So it feels like both proposals, at least at the base layer, start out in the same place. I just thought I'd point that out.
I agree with that. Cool. So, just recalibrating, since I've been programming all day: we discussed focusing just on this first layer on this call, correct? Layer A.
Yeah, so we want to give an overview of the layers and, unless there are strong objections in the group, dive into layer A.
Okay, so I will review each of the layers and then we'll dive into layer A. Layer A is byte storage. The idea behind it is that you're building some software and you need a place to store bytes of data. However you want to identify those bytes that you want to store, that's a question for this layer. And the interface by which you want to store those bytes is a question for this layer. So for those familiar with MongoDB, or SQL, or any database: create, read, update, and delete of bytes. And for IPFS users, this is like adding IPFS content and getting IPFS content from a hash. Just raw byte storage. So that's layer A. Then layer B is what I'm calling logical storage. You could call it business object storage, or you could call it vault data model storage. The distinction is that it operates not on bytes but on logical objects that have a specific schema that we could define formally. So, today, if you read the EDV spec, there are actually three primary data model objects that are described: a vault object, a document object, and an index object. They're each described in JSON in the spec today, and they could be persisted in numerous different ways at the byte storage layer. But when you look at them, they look like objects. And they are labeled with friendly names like vault and document and index, or I'm proposing that they be labeled with friendly names like that. And I'm proposing that when you see a vault object in the future, you'll know: oh, it always has this ID property. It always has this index property here, which may or may not actually be co-located with it. It might have some other properties that are known. And similarly for documents: they will always have an ID, and they might have this index property here.
And they may have some JWE representation of the document, they may have links to encrypted streams that are associated with the document, things like that. And indexes, similarly: a JSON object that has relationship information about indexes encoded in JSON. So layer B is the logical storage layer, and it operates on a data model. That's one layer above storing raw bytes. So that's pretty much it. And then one layer above that is layer C, and feel free to stop me if there are questions as we go.
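As a rough illustration of the layer A idea described above (create, read, update, and delete on opaque bytes, plus an IPFS-style add and get by hash), here is a minimal in-memory sketch in Python. The class name `InMemoryByteStore` and its method names are hypothetical and do not come from the spec, and the plain SHA-256 hex digest is an assumed simplification of real IPFS content identifiers.

```python
import hashlib
from typing import Dict, Optional


class InMemoryByteStore:
    """Hypothetical layer A: stores opaque bytes and knows nothing about their meaning."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def create(self, key: str, value: bytes) -> None:
        # Refuse to overwrite silently; overwriting is what update() is for.
        if key in self._data:
            raise KeyError(f"{key!r} already exists")
        self._data[key] = value

    def read(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def update(self, key: str, value: bytes) -> None:
        if key not in self._data:
            raise KeyError(f"{key!r} does not exist")
        self._data[key] = value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)

    def add(self, value: bytes) -> str:
        # Content addressing: the identifier is derived from the bytes themselves.
        key = hashlib.sha256(value).hexdigest()
        self._data[key] = value
        return key
```

The two identification styles from the discussion sit side by side here: `create` takes a caller-chosen identifier, while `add` derives the identifier from the content.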
I have a question from Adrian. Go ahead.
In layer A, for example, there's a permission model, either because somebody has to pay, or because somebody has to have write access even though they're not paying for that byte storage. Can you say how you see that propagating? Can you say something about that?
So nobody stores bytes for free, period. Usually, you're going to be paying somebody to store the bytes. Or you're going to be using a system like IPFS, and you're going to make sure that all the bytes you care about are stored. Or you're going to pay somebody else to make sure that those bytes are stored, that kind of thing. You can actually sign up for MongoDB today, and it'll be free up to a certain amount of utilization, right? That's the way MongoDB works. IPFS is similar: you can register with certain IPFS providers and some amount of data will be free. Or you can just start throwing data into that network and hope that somebody persists it somewhere without paying, which is not a good strategy, but it is a thing that you can do. So the point is basically, at some point, the byte storage layer will become big enough that someone will come asking for money or they'll stop accepting stuff, right? And the payment regimes, like how you choose to sell this, are out of scope for the layers conversation. But what is definitely true is that everything you would need to figure out how you wanted to price this is in these layers. And it's
a question... My question is: whoever's getting paid has to have an email address to send the invoice to, or a PayPal address to extract money out of. What I'm trying to figure out is, in layer A, is it the same as it is in layer B? Never mind layer C.
So I think the answer to that is there is probably a single entity that is working on layers A and B. Well, you know, it could be multiple entities, and the payment scheme could become very complicated. But I'm going to assume right now that there's a single entity that's doing layer A and layer B.
We can call it payment, or call it a notification issue, because it leads us to identity and it leads us to privacy issues, to the decentralization rubric. So I didn't mean to take you down the path of how much it's going to cost or whether it's free. I meant to take us down the path of who controls layer A, who controls layer B, and who is notified when there's an issue. Right.
That's an excellent question. We do have a queue, so I want to make sure we get to it, and I suspect some of the people queued up might have an answer. But you're absolutely right, that is a crucial part of a discussion for later. Jonathan.
Just on layer A: really, I would prefer block storage. There are actually many more layers of logic in MongoDB or IPFS, in how data is chunked and stored. So it's really blocks. Block storage, I think, would be preferable to raw bytes. And there are layers of logic on top of layer A, or inside layer A, before we get to layer B.
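To make the point about chunking concrete, the following Python sketch splits a byte string into fixed-size blocks, stores each block under its hash, and keeps an ordered list of block identifiers for reassembly. The function names, the tiny default block size, and the use of a plain dictionary as the block store are all assumptions for illustration. Real systems such as IPFS or MongoDB use their own, more elaborate chunking logic.

```python
import hashlib
from typing import Dict, List


def store_as_blocks(data: bytes, blocks: Dict[str, bytes], block_size: int = 4) -> List[str]:
    """Split data into fixed-size blocks, store each under its hash, return the ordered block ids."""
    manifest: List[str] = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        block_id = hashlib.sha256(block).hexdigest()
        blocks[block_id] = block  # identical blocks are stored only once
        manifest.append(block_id)
    return manifest


def load_from_blocks(manifest: List[str], blocks: Dict[str, bytes]) -> bytes:
    """Reassemble the original bytes by concatenating blocks in manifest order."""
    return b"".join(blocks[block_id] for block_id in manifest)
```

The manifest is the extra layer of logic mentioned above: the caller no longer addresses one run of bytes, but an ordered set of blocks.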
I mean, I certainly like that. Can you leave a comment on that issue, if you haven't already?
So thank you for that. Manu, go ahead. You're next.
Yeah, I just wanted to highlight something that Dave Longley said in chat, so folks should go in and read that. The payment stuff and who has access is really kind of a negotiation between the layers, right? So the person that's running logical storage, layer B, may have a relationship with someone that's running layer A. But just to be clear, what we're trying to do is make sure that the layers are correct, and we're trying to start with the simplest layer here, which is byte storage, block storage, whatever you want to call it. The other thing I want to make sure of is that we don't overcomplicate things right now. So, Adrian, one way to make this really simple is to ask you the question: who do you pay when you save a file on your laptop? And the answer is nobody, right? You already paid for that storage when you bought the laptop, and you are effectively managing layer C, layer B, and layer A all on your device. It's all local. So it's interesting to talk about really complex schemes like IPFS and MongoDB, and those are all options for layer A. But when we talk about this stuff, let's try to simplify and just say that layer A is like: I'm storing bytes on my hard drive. That's what layer A is.
And so we have Serge next on the queue, but I just want to add a side comment. I think I see where you're going, Adrian, with the question: that the payment and the relationship responsibility lead to identity.
Yeah, right. Privacy issues. That's all.
So, well, I think we'll talk about that in this deep dive, because one of the things that we're going to mention is that layer A is encrypted: encrypted byte storage, or encrypted block storage, like Jonathan mentions. That's an important part. Serge, go ahead.
Well, that was going to be exactly where I was going with my question, Dimitri. I guess it's two parts. One is whether the byte stream is encrypted or whether the block-level storage is encrypted. Because, obviously, from a Datashards perspective, that's exactly where we think it should be: at that lowest level. And it's a little unclear here where the storage begins and ends in terms of blocks when we talk about block-level storage. The second question is in relation to that. So we have this abstract notion of block-level storage: is there an interest or an attempt to unify that block-level storage, or is that left to the implementation? Because the documentation mentions MongoDB and IPFS, and in the Datashards model that's an even lower level. We consider that a low-level implementation detail, rather than an abstraction that is left up to the developer who wants to build applications on top.
Can you add a comment to that effect to the issue so that we can have that discussion?
Which issue whichever
The second comment, I think, is what Ari was referring to, which is: if we're going to talk about block storage, then we want to point out that raw byte storage, whether it be Mongo or disk or anything, is a layer even below that, and that is out of scope. So that's a good point. And I, at least, have the same mental model as you. So add that comment. Who wants to take
Fair. Just the first question, about the encryption part.
I would move to table the concept of encryption until we get through the list.
I like it.
Because I think it's very easy to just have an open-ended conversation around it. And I think if we can agree to get through the list and then focus on layer one, we're going to start talking about encryption as soon as we get
Okay, cool. Any other queued questions we should cover?
Well, I just wanted to respond to Manu's point. I see the issue as a privacy issue, not as anything else. Now, if you want me to table that and put it down as a comment, I can do that; I haven't read all the recent comments in 74. But it's a pure privacy issue. So saying that it's a payment complication was just misleading on my part. It is a privacy issue.
Thank you, and we would like to table that discussion. It's a very good point.
I think we're going to get to it as soon as we get to layer A, right? Same thing: privacy and encryption are probably very related in this group's mind, and we're going to have to tease them apart and get really concrete about them. We're going to do that as soon as we get to layer A, and we're going to probably spend the next couple of calls just on that layer.
Alright, I think that's everybody. Go ahead. Okay.
So layer B is some kind of logical layer around some concepts. Layer B is really the beginning of a developer-oriented API. It's giving developers a set of friendly interfaces, abstractions that are useful for developers. That's a really important distinction between layer B and layer A: developers like abstractions built on abstractions built on abstractions, and we want the highest-order abstractions because they let us do a lot very quickly and communicate clearly. But abstractions need to be built on top of each other. Layer B is a logical storage layer, where we get to define what the data models are and what the interfaces for them are. That's what we're going to be talking about when we talk about layer B. Layer C is about permissions. And you can think of layers in...
Question about layer B.
I would ask: is it in scope to say that things that are like an index might be separated from things that are like a vault or a document? In other words, for separation of concerns, again from a privacy and consumer perspective, one would want to separate those into two separate entities that are separately chosen.
Potentially? Yeah, those are...
Yeah, that's in scope. And that's why we're separating layer A from layer B. So there could be multiple indexes pointing to the same layer, right? Good.
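The layer B objects discussed here (vault, document, index) can be sketched as plain data classes that serialize to JSON bytes for whatever byte store sits underneath. This is only an illustrative Python sketch: apart from `id`, the field names (`jwe`, `streams`, `entries`, `index_ids`) and the helper `to_bytes` are hypothetical and are not taken from the spec. It also shows two separately held indexes referring to the same document, as raised in the question above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class Document:
    id: str
    jwe: Dict[str, str]                               # encrypted payload, placeholder shape
    streams: List[str] = field(default_factory=list)  # links to encrypted streams


@dataclass
class Index:
    id: str
    entries: Dict[str, List[str]] = field(default_factory=dict)  # attribute -> document ids


@dataclass
class Vault:
    id: str
    index_ids: List[str] = field(default_factory=list)  # indexes need not be co-located


def to_bytes(obj) -> bytes:
    """Serialize a logical object to JSON bytes that a byte store could persist."""
    return json.dumps(asdict(obj), sort_keys=True).encode("utf-8")


doc = Document(id="doc-1", jwe={"ciphertext": "placeholder"})
by_type = Index(id="index-type", entries={"type:note": [doc.id]})
by_owner = Index(id="index-owner", entries={"owner:alice": [doc.id]})
vault = Vault(id="vault-1", index_ids=[by_type.id, by_owner.id])
```

Because each object reduces to bytes through `to_bytes`, the indexes could be persisted with a different provider than the documents they point at.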
Cool. So layer C is about permissions. I think the easiest way for people to think about this is in terms of an HTTP server with no permissions on it that will write. So think about the IPFS API with no permissions in front of it. By that, for people familiar with IPFS, I mean the one that runs on port 5001, which will write data and store data to the node. And then think of another HTTP service that sits in front of such an API and decides whether or not to allow a write to take place. So if you think about an API that will do CRUD without question, and an API that will decide whether or not CRUD is going to happen, that second thing is a permissions layer. It could be a proxy in front of an HTTP service that has no concept of permissions, or it could be embedded inside of some other system. But the point is that permissions are a layer above a raw API that does CRUD. Anybody can ask for create, read, update, delete on a database; whether or not that should happen needs to be enforced at some layer above the raw ability to create, read, update, or delete. You see this throughout software design, in the way that web servers and databases are typically linked. You see the logical permissions layer. Some of the technologies listed here, OAuth and transactional auth, are access-token-based permissions models. And ZCAPs is another permissions model that exists at that layer. The important part is that, for both of those permissions concepts, they're not about permissions on bytes. They're permissions on a logical storage data model. Which is to say, there might be another permissions model somewhere lower, closer to layer A, that's about permissions on raw bytes. But when you're talking about an externally consumable interface, particularly a REST API, you're talking about a set of resources on a specific server.
And you're talking about operations on those resources. The permissions model at that layer is: what changes to those resources are going to be allowed, and how do you choose to enforce that? There are many options for that. But the simplest way of thinking about it, in general, is: can a role take an action on a resource? Can an admin delete a table? Things like that. You can express these concepts of permissions in many different ways. I'm just putting it all in this layer that exists above logical storage. Permissions is a very complicated topic, so I expect that we'll have even more debate on it. But we'll take questions on it now.
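The "can a role take an action on a resource" framing can be sketched as a thin wrapper placed in front of a store that performs operations without question. This Python sketch is an assumption-laden illustration: `RawStore`, `PermissionedStore`, and the grant tuples are hypothetical names, and a real deployment would more likely be an HTTP proxy using a token or capability scheme such as those mentioned in the discussion.

```python
from typing import Dict, Optional, Set, Tuple


class RawStore:
    """Performs every operation it is asked to, like an unprotected API."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def write(self, resource: str, value: bytes) -> None:
        self._data[resource] = value

    def read(self, resource: str) -> Optional[bytes]:
        return self._data.get(resource)

    def delete(self, resource: str) -> None:
        self._data.pop(resource, None)


class PermissionedStore:
    """Sits in front of a RawStore and decides whether each operation may happen."""

    def __init__(self, store: RawStore, grants: Set[Tuple[str, str, str]]) -> None:
        self._store = store
        self._grants = grants  # each grant is (role, action, resource)

    def _check(self, role: str, action: str, resource: str) -> None:
        if (role, action, resource) not in self._grants:
            raise PermissionError(f"{role} may not {action} {resource}")

    def write(self, role: str, resource: str, value: bytes) -> None:
        self._check(role, "write", resource)
        self._store.write(resource, value)

    def read(self, role: str, resource: str) -> Optional[bytes]:
        self._check(role, "read", resource)
        return self._store.read(resource)

    def delete(self, role: str, resource: str) -> None:
        self._check(role, "delete", resource)
        self._store.delete(resource)
```

The raw store stays unaware of roles entirely, which is the separation being argued for: enforcement lives one layer above the ability to read and write.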
Let's take any questions on it. Yeah, we've got a couple of people on the queue. Serge, just want to make sure:
you've had your hand up for a while. Is that from before, or were you queuing up for another question?
Oh, that was from before, I guess. I don't know how to take
it down. There's a lower hand button. Yeah, no, I know. It happens. Daniel, go ahead.
Yeah, I was just wondering. I think the interesting thing here is that, at the end of the day, it'd be super useful for that higher-level layer we talked about to actually have knowledge, like in the one issue where we wrote about how many of a certain type of object there are. You know, an actual type, a semantic, friendly-language type, whether in a hub or... give me all of these other types of objects, or ones that have this other thing in common. That's what's going to make this a useful system, I think. And it kind of came up here in the discussion, with people saying: well, I guess you could break this apart, where you had one completely separate party doing that indexing, and then the data store itself, the thing that stored these encrypted bytes, was in a completely separate location. And yeah, I guess if you layer these things that way, that's correct, that could be true. I think the thing that scares me a little bit is: what is the reference implementation going to be? Because, while that's totally possible, if we don't have something that is put in a box that basically can do what the identity stuff can do, I would be shocked if we got anywhere close to the same adoption. You know, if it was a collection of things that didn't really sing together as one thing, in a Docker container, that did the high-level stuff. You know what I'm saying?
Yeah, so we're trying to get all the way through to the layer that would meet the identity hubs interface, as I understand it, which I believe is at layer D or layer E.
Yeah, I'm not talking about that. I get that. I understand that you're saying that having all the layers together is going to do that. What I'm saying is: in the reference implementation, are we going to have them all together, so it looks like one cohesive thing? Or is it just going to end up being a bunch of components?
I don't think we can really safely have a conversation about that until we spend a couple of calls on layer A, so
but I want to add a comment. Daniel, I think that's a really important and subtle point that you just made about separation of the index. Can you leave a comment on that issue, as a reminder for us to talk about it when we do come back to it? Because I don't want it to get lost, specifically user
privacy. There's another issue that's already open, an issue that discusses visibility, and it's mentioned there. Okay, got it. Rather than commenting about metadata on every issue, if you want to discuss plain text metadata, please comment on issue 79.
Excellent. Adrian, you're up next.
Yes, I want to ask: what's the difference between permissions and policies? In other words, are we going to have policy-based decisions at layer A and layer B and layer C? Or are we going to assume that, whether you think of it as object capabilities or managed encryption schemes, public/private key encryption schemes, enveloping schemes, whatever, all of the policy happens at layer C along with the permissions? So are we going to put all of the policy-related stuff in layer C, or are we going to spread policy along with permissions throughout A, B, and C? That's my question.
Next Next one. Thank you. And I think Dimitri is going to get to the answer to that. So I'm gonna make sure process there,
Manu, go ahead. Um, yeah, this touches on the question Adrian asked, which I'm referring to Dimitri. I think layer C might be slightly misnamed, in that it might be misnamed in a confusing way. It's really about authorization, I think, not permissions, although you can argue that the two may be synonymous with one another. So anyway, I'm noting that that probably needs to be teased apart, and people shouldn't read too much into the word permission, or authorization, or authentication.
That's it. Thank you, Manu. And I absolutely do look forward to arguing about authorization versus permission versus policy, which is another really good term, Adrian.
Yeah. So I wanted to step on the queue to add.
One, just to reiterate what you're saying, Adrian: that is an excellent question about which layer authorization policy happens at, and we will come back to it. So leave a comment, if you haven't already, just so that we don't lose track of it. And the other thing that I wanted to draw people's attention to while we're discussing is that several of these layers are very much meant, from the outset, to be pluggable, to be interchangeable, or at least that is one of the hopes. That includes authorization policy, so keep that in the back of your mind.
Think that's everybody. Don't forget to lower your hand
So I agree with what you said, Dimitri. And I think you're definitely going to see data model enforcement at layer B, because that's kind of what that whole thing is about. So there will be some concept of enforcement at these various different layers, and we'll get to it. Layer D is attempting to start to reach into the realm of identity hub features. Identity hubs have a very message-oriented architecture. It's a little bit hard to tease apart exactly how it works, but there's continuously a concept of messages, and messages related to each other. In particular, messages related to each other with plain text metadata that is processed in some way by a server. CRDT stands for conflict-free replicated data type, and CRDTs are a way of handling concurrent updates to the same object, in a way that lets the object dynamically be changed by multiple parties. They're a very complicated database concept, which you could spend your entire PhD thesis on just by itself. There are various databases that have different approaches to them. CouchDB has a mechanism for supporting CRDTs. Firebase, which was bought by Google, has a way of supporting conflict-free replicated data types. And OrbitDB has CRDT functionality built into it, and OrbitDB is in turn built on top of IPFS. So CRDTs are a mechanism that's fundamentally at a higher level than just storing raw bytes. And, I would argue, also at a higher level than having some concept of a log. Typically CRDTs are built off of some ordered set of events. They're common in event sourcing, which is a programming paradigm you could also spend a really long time on. So layer D's job is to carve out space specifically to address the fundamental data model needs of the identity hubs specification as it is written today. CRDT stands for conflict-free replicated data type.
It's a data type that can be copied around to multiple locations and will automatically update as the stream of events is processed by those various different locations.
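For readers who have not met CRDTs before, one of the smallest textbook examples is a grow-only counter. The Python sketch below is a state-based variant chosen for brevity, which is an assumption on our part: it is not the event-log style described above, and it is not how any of the databases named in the discussion implement CRDTs. The class and method names are illustrative.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: each replica only ever increments its own slot."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Taking the per-replica maximum makes merge commutative, associative,
        # and idempotent, so replicas converge regardless of merge order.
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())
```

Two replicas can be updated independently and merged in either order, any number of times, and they end up with the same value without coordinating on who wrote first.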
Adrian, you can think of it as an engineering trade-off: you are trading space for some savings on strict ordering. So you're trading space for compute, essentially, but it's more complicated than that. Manu, I think you're next on the queue. Daniel, is your hand raised from before? Did you have another comment? Okay,
go ahead, Manu.
Yeah. Again, on the naming of the layer: thinking about it a bit more, it feels like this layer is really about replication. So message-based and CRDTs are one way to solve the problem, right, or two conjoined ways of solving the problem. This is really about: I have data here, and I want it over there. And how do I make sure that, when the data is cloned, I don't get a different copy over there? How do I make sure in real time? That's where some of the CRDT stuff comes up. If I have storage at my house, at my place of work, and at my bank, how do I make sure that all of those systems continue to synchronize with one another? So that if my house burns down, or I stop working at my office, I still have two backups of my data that can't be taken away from me. So it has to do with self-sovereign storage and that kind of stuff. But we may want to call this, more generically, replication.
That's it. Thank you. Yeah. Yeah. I think Jonathan's next on the queue.
And so, not just replication but reconciliation, because it really is about the delta of what's changed and making sure everything is in sync. And there's this concept of last writer wins, at least for delta CRDTs.
Yeah, that's a good point. I think layer D is maybe a couple of different identity hubs concepts all smashed into one layer, which could be teased out and separated. So when we get to that layer, we're going to have a lot of questions about that. And, as Daniel Buchner keeps mentioning, there are questions about whether we have what we need in the lower layers to make that layer work, or whether we've lost it by the time we get there. We're going to do our best to make sure that that's not the case. But we're also going to have to start at the lower layer and work our way towards complicated PhD-level concepts like CRDTs.
Tobias, I think one more in the queue.
Yeah, I just wanted to clarify again. I think there's imprecision around some of the terms we're using. Replication makes sense, because if you have two storage providers that are aiming to make the same state available in two different locations, then replication is merely those two stores sharing that state. But the synchronization, if there are conflicting updates that were made to those two different sources of state, is the reconciliation part.
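The distinction between replication and reconciliation can be shown with a small Python sketch. Replication here is simply copying state to a second store. Reconciliation resolves conflicting updates, in this case with a last-writer-wins rule of the kind mentioned earlier in the discussion. The `Versioned` record, the integer timestamps, and the tie-break on replica id are illustrative assumptions, not something specified by the group.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Versioned:
    value: bytes
    timestamp: int    # assumed logical clock; larger means later
    replica_id: str   # tie-breaker so both sides pick the same winner


def replicate(source: Dict[str, Versioned]) -> Dict[str, Versioned]:
    """Replication: make the same state available in a second location."""
    return dict(source)


def reconcile(ours: Dict[str, Versioned], theirs: Dict[str, Versioned]) -> Dict[str, Versioned]:
    """Reconciliation: merge two diverged copies using last writer wins."""
    merged = dict(ours)
    for key, candidate in theirs.items():
        current = merged.get(key)
        if current is None or (candidate.timestamp, candidate.replica_id) > (
            current.timestamp,
            current.replica_id,
        ):
            merged[key] = candidate
    return merged
```

Because the comparison uses the same total order on both sides, `reconcile(home, work)` and `reconcile(work, home)` produce the same result, which is what lets the two stores converge after conflicting edits.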
Do we want to take one more question? We've got Venu on the queue.
Yeah, I think we have to take questions. Let's do it, then.
Is replication in the right place, above authorization?
I kind of
Wait, I didn't hear one of the words.
You're basically placing replication over authorization in the layering. I'm just,
That's going to cause problems. Oh, yeah, for sure. You absolutely are not going to be able to just sync. I think there are layers of permissioning. Once you hit the multiple-instance coordination problem, which is a thing you hit as soon as you have a CRDT, you're going to have permissions with you forever. They're going to follow those instances around. So yeah, for everything above logical permissions, assume that the permissions just get more complicated and that they don't go away.
What is wrong with reversing the order?
I think it's more to do with trying to address the concerns around encrypted data, encrypted metadata, and the interfaces on top of that, and the two input documents that we have, which are the encrypted data vaults and identity hubs specifications. We're about to get to layer D and be able to start over again, but essentially layers A through C are basically what encrypted data vaults are, and they have no concept of layers D and E. Identity hubs are about layers D and E, and they have no implementation. So we're stuck with a kind of awkward attempt to put these two things together. The proposal here is to go through the layers, then go back to layer A, and then try to walk forward until we get through identity hubs. If we do that successfully, we will end at identity hubs with all the features that identity hubs have described in terms of markdown, with the right set of permissions and privacy guarantees that we want. So on reversing the order: we're definitely going to be changing these terms, and there are going to be paragraphs and paragraphs of text under each of these to make clarifying statements around them. But, for example, you really can't have a CRDT that's replicating across instances without some concept of allowing that to happen, right? It's not just a thing that happens, unless you just open IPFS port 5001 on your public internet host and allow everybody to connect to it, which is a thing you might do, and maybe that's going to solve big chunks of this. I would suggest that we're going to get to the questions around how permissions and authorization weave throughout each of these in great detail as we loop back around.
Okay. I wanted to add a comment, but I think that was already said, exactly the same thing that I wanted to add, which is: the layering, the ordering, is correct. Because I've actually seen and worked on systems that tried to layer permissions over CRDTs rather than the other way around, and that's much more difficult. So it is a strategic, conscious decision to approach permissioning first, and then synchronization and replication.
Yeah, and you can see it in things like Firebase, which require authentication before replication happens. And similarly for OrbitDB. I mean, both of them have a concept of permissions before you get that nice higher-level interface that you want. But I think when we get to that layer, we're going to really do some serious lifting and surgery to tease these things apart and make them as clear as we possibly can.
I think that's the queue. Go ahead.
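The ordering argument above, permissions first and replication on top, can be sketched as a replica that refuses to hand over its state to a peer it has not authorized. This is a toy illustration under stated assumptions: the `Replica` class and its `grant` and `pull_from` methods are hypothetical names invented for the example and do not correspond to the Firebase or OrbitDB APIs.

```python
class Replica:
    """Hypothetical storage instance that gates replication on a permission check."""

    def __init__(self, name: str):
        self.name = name
        self.data: dict[str, bytes] = {}
        self.allowed_peers: set[str] = set()

    def grant(self, peer_name: str) -> None:
        # The owner explicitly authorizes a peer to replicate from this instance.
        self.allowed_peers.add(peer_name)

    def pull_from(self, source: "Replica") -> None:
        # The permission check happens before any data moves.
        if self.name not in source.allowed_peers:
            raise PermissionError(f"{self.name} may not replicate from {source.name}")
        self.data.update(source.data)


home = Replica("home")
home.data["doc"] = b"opaque bytes"
work = Replica("work")
stranger = Replica("stranger")

home.grant("work")
work.pull_from(home)          # allowed: home granted access to work

try:
    stranger.pull_from(home)  # refused: no grant exists
    refused = False
except PermissionError:
    refused = True
```

Layering it the other way, with replication running first and permissions bolted on afterwards, would mean the data has already left the instance by the time the check runs.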
Cool. So the last layer, and I don't know that this is the last layer, but it's certainly what I'm proposing as the last layer. And I'd love for the working group to come to consensus, because having no ceiling is a dangerous place for us to be. But when I think about identity hubs and their ultimate form, if I imagine that they exist today and they're working exactly as I dreamed, and I imagine what kinds of stuff I am putting in identity hubs and what interfaces I'm getting out of identity hubs being real, I'm thinking about things like DIDComm, things like supporting Hyperledger Aries connections and wallets and things like that, thinking about Xbox user profiles, Bing ad preferences getting automatically shared as I wander around the internet. Really, these are interfaces that are built off of replicated data in a standard format that applications can request access to and be granted access to. So imagine that you go to a website, and the website wants to know your preferences, and it notices that you have an identity hub connected to it. It can ask you: hey, I need your dimensions, your measurements, so that I can customize some screenshots of clothing to sell you. Right, I might want to selectively disclose that information to websites, and the identity hub might have already figured out a way of storing that information and expressing it in a way that could be semantically queried, that would preserve my privacy but where I could still provide that information to that website. Things like that. So I don't know, really. Layer E is, I would say, entirely out of scope for the working group in terms of building any specific thing in this category. But it's very much in scope in terms of considering the end interface for identity hubs.
Because whatever that end interface for identity hubs is, it will be the edge between the work that we do in this working group and the applications that are built on top of the service. And it's very critical that we have a clear understanding of the use cases: very concrete, specific, exact messages, exact data models, clear examples that hit that boundary, so that we can understand the difference between an identity hub and these lower-level concepts that it's built on. So that's it for those layers. I think there are questions around layer C, there are questions around layer D, and layer E, to me, feels like something that also has incredible amounts of questions and is very much not concrete.
We have Adrian,
I think I'm on the queue. Can I ask, do you mean published data structures as opposed to standard data structures? Because I agree with you, it might be out of scope, except that you mentioned it. Because we're working towards the rubrics of decentralization, implying that whatever we're going to call an identity hub has to be standardized is a conscious decision. And would you settle for published data structures in order to
Yeah, I think so. I think I would use that term. You know, maybe you're saying,
Good point. And you know what I'm going to say: leave a comment so we don't forget.
Manu, go ahead, you're next.
Yeah. So I think this was great, right? I'm not hearing a tremendous amount of pushback. There are certainly questions that we absolutely need to get to, and we will. But I'd like to see if we can go back to layer A and start really talking about it, and see if we can get that laid down as a base, foundational layer, and hear what people would be concerned about, knowing that this is at least how the layers are being proposed. If we can get the group agreeing to layer A, getting consensus around it, that really helps us go on and talk about the next layer. And if folks feel like byte storage isn't the base layer, then let us know: what is the base layer, then, if it's not that? What are we building on top of? So yeah, maybe in the last couple of minutes, let's go back to layer A and start talking about whether people agree that that's the base layer or not.
Sounds good. Thank you, Manu. And we will.
I suspect we will be continuing to talk about layer A on the next call. I wanted to call out real briefly that one of the things we'll need to start doing, even though it's fairly early in the group, is to propose several topic-specific calls. Layer A will be one of them. And, more specifically, the data shards folks have expressed interest in interoperating with our layer A, so we would love to have a future discussion on adopting, or the suitability of, the data shards spec, for example, as the layer A foundation. Yeah, so as Dave mentioned, right, no, that was in answer to Juan's. Okay, so in the few minutes remaining, let's dive into layer A. Are there any stairs? Go ahead, you're on the queue.
So my question was actually going to be: is that something we want to do in a future all-together call, or maybe have a breakout call around this topic? To be clear, whether or not data shards is suitable for layer A.
So I suspect we want at least one more deep dive call on layer A, just so that people understand the issues involved, right, the criteria against which we're talking about suitability.
Do you think
that's fine? I mean, I don't have any opinion on that. I suppose you're right, though, that it's background that everyone should have in common before discussing suitability or any other specification. Yeah.
So, do we have a layer A issue, or shall we open one?
I don't think we have a specific layer A issue, but I will go ahead and create one.
Nanos on the queue.
Sorry, I'll lower my hand.
So I'm creating an issue called "alphabet proposal layer A" and placing it in chat. It's empty. It has no description. Don't ever create issues like this. I'm going to cross-link it to the alphabet proposal we just discussed.
Adrian. I'll raise the first issue. Is encryption required or optional? I don't know how to state it, but there's clearly this question about whether encryption is a mandated feature. In other words, do we have to assume that encryption is there even in the cases where you publish the key elsewhere for all to see, or not? So I think that's a layer A conversation.
Absolutely, a hundred percent.
Go ahead and raise that question on the issue, and let's start deep diving on layer A there.
Um, so are we raising questions? Because I'd like to just assert that encryption is assumed at this layer. I would be very shocked if people believe that it's not. Because I think not having encryption, not assuming that the things coming into this layer are encrypted, is an argument for very easy surveillance capitalism. Right. So I'm wondering if we can just pose the question: does anyone in the group believe that this layer shouldn't always be encrypted? The bytes are encrypted, full stop, at this layer. They're encrypted at a higher layer; the higher layer has to encrypt them and send them down to this layer. But when we get bytes at this layer, they're encrypted. We don't know what they say.
And just before people bring up objections, I wanted to echo Dave Longley's point that this is exactly analogous to the discussion of: do we want HTTPS everywhere, or can some stuff be HTTP? And the answer is, even for a public site like google.com, it's still HTTPS. It's still encrypted, even though it is transparent to you, the user. And as an industry, we have essentially settled on HTTPS. Dinesh, you're next on the queue.
Yes, I just wanted to make sure that there's a distinction: it is the bytes coming in, the ones we're telling layer A to store, that are encrypted, not that we are enabling some feature on some database to store the bytes encrypted. It's subtle. I want to make sure that we're talking about the former, that the bytes coming in are encrypted before we're even storing them.
That's a really good point.
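The distinction drawn here, bytes encrypted before they reach layer A versus a database feature that encrypts at rest, can be sketched as follows. Everything named in the example is an assumption for illustration: `ByteStore` stands in for layer A, and `toy_keystream_xor` is a deliberately simplistic, NOT secure stand-in for a real authenticated cipher, used only so the sketch runs with the standard library alone.

```python
import hashlib
import os


def toy_keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy cipher for illustration only. NOT secure; a stand-in for a real cipher.

    XORs data with a SHA-256 counter-mode keystream. Applying it twice with
    the same key and nonce returns the original bytes.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class ByteStore:
    """Hypothetical layer A: stores whatever bytes it is given and never interprets them."""

    def __init__(self):
        self._blobs: dict[str, bytes] = {}

    def put(self, name: str, blob: bytes) -> None:
        self._blobs[name] = bytes(blob)

    def get(self, name: str) -> bytes:
        return self._blobs[name]


# The higher layer holds the key and encrypts BEFORE calling the store.
key = os.urandom(32)
nonce = os.urandom(16)
plaintext = b"my measurements and preferences"

store = ByteStore()
store.put("doc", toy_keystream_xor(key, nonce, plaintext))

# The store only ever saw ciphertext; decryption happens back in the higher layer.
recovered = toy_keystream_xor(key, nonce, store.get("doc"))
```

Note that `ByteStore` has no key and no decrypt path at all, which is the point being made: encryption is a property of the bytes arriving at the layer, not a storage option the layer switches on.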
I think it depends on the use case. So yeah, assume everything is encrypted, like HTTPS. But HTTPS is a transport protocol. And there are certain things, at least in identity hubs, that you would actually want to be public. I think it really is about authenticating the underlying data, and that gets into some more details of the authenticity of what's stored. And I think that would be harder to do if it's encrypted.
Thank you. Venu, I think you're next.
And Chris, I think you're after that.
Venu, go ahead.
Sorry, I was on mute. I was talking to you. Yeah. I was going to repeat what Jonathan said. Basically, HTTPS is a transport protocol. And there is a possibility of having public information, and I don't see why we should make any assumption about what the bytes are at this layer, right? It is what it is, whatever the other layer sends. It's very likely that it is going to be tamper-proof at a higher-level layer, but not necessarily encrypted all the time, I think.
Chris, and then Manu. Yeah, we are running out of time. We only have two minutes left, but go ahead, Christopher.
Sorry, I forgot to press unmute on here as well as on my mic. So whether or not we're working with a content-addressed system that is itself unencrypted, right, if you put 32 bytes of encrypted stuff in there, it's very hard to tell whether it's encrypted or unencrypted. And that's easy to acknowledge up front. But we should take the stance that everything we work with in this group is encrypted, even if it's public, and that is because the correct generalization is: everything's encrypted, and some things are being shared in public spaces. It's a much better stance to take, especially because it makes it much safer for people who are hosting things. Right. If the generalization is that everything is encrypted, and that you know as little as possible, that also means that in the general case it's much less frequent that you're asking people to inspect the underlying chunks that they're transporting, which makes it much easier to develop a cooperative network where we can assume that we can all safely share chunks around with as little knowledge about what they are as possible.
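The idea of hosts passing chunks around while knowing as little as possible about them can be sketched with a content-addressed store, where a chunk's address is simply a hash of its bytes. This is an illustrative sketch, not the data shards design: the `ChunkStore` class and the choice of SHA-256 hex digests as addresses are assumptions made for the example.

```python
import hashlib


class ChunkStore:
    """Hypothetical content-addressed store: the address is the SHA-256 of the chunk."""

    def __init__(self):
        self._chunks: dict[str, bytes] = {}

    def put(self, chunk: bytes) -> str:
        # The host never parses or inspects the chunk; it only hashes it.
        address = hashlib.sha256(chunk).hexdigest()
        self._chunks[address] = chunk
        return address

    def get(self, address: str) -> bytes:
        chunk = self._chunks[address]
        # Anyone can check integrity from the address alone, without a key.
        if hashlib.sha256(chunk).hexdigest() != address:
            raise ValueError("chunk does not match its address")
        return chunk


store = ChunkStore()
opaque_chunk = bytes(range(32))  # 32 bytes that could be ciphertext or not
address = store.put(opaque_chunk)
fetched = store.get(address)
```

From the host's side, 32 bytes of ciphertext and 32 bytes of public data look the same: an address and a blob. The address still lets a reader verify that the bytes were not altered, which does not by itself settle the authenticity question raised earlier about who produced them.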
Thank you so much, and unfortunately, we're at the top of the hour. Manu, do you have like a two-second comment?
I think what Chris said was perfect.
Excellent. So we will pick up this conversation next time. Thank you, everyone.
I'll change gears but thanks so