Hello, I'm Rob Hirschfeld, CEO and co-founder of RackN and your host for the Cloud 2030 podcast. In this episode, we delve into spatial computing, or Apple's Vision Pro, the face computer. We were really intrigued because all of us in the Cloud 2030 group are interested in augmented reality and virtual reality, and this product, the Apple Vision Pro, seems to cross over multiple lines in a way that made us surprisingly optimistic about the potential to actually have a hit here. We had a really thorough conversation about what we liked, what we didn't like, what we thought was going to be a challenge, and some of our surprises. If you are watching this space at all, or are new to this concept of a spatial computer from Apple, I think you'll get a lot out of this conversation. Enjoy.
The place where I've been seeing the most interesting change in the quality of what I'm getting back is with the multi-agent frameworks. AutoGen is one that I'm finding I'm using a lot, and for prototyping, AutoGen Studio is very good. Once you have it tricked out the way you want, you go through and clean it up in full-blown AutoGen. The nice thing about it is that it really does support function calling, so any tools that you can build that reach out into the net for search or scraping or transposition, it's very good for that. It's also quite good for using local LLMs as opposed to foundation models, and multiples of them, so you might have a coding agent that is using Code Llama, for example, but a general director or group-manager agent that's using Mistral or one of the others. Truly, the quality of what I'm getting off of both, some would call them one-shots, but pretty ambiguously worded requests, has been really good. And when you get serious about saying, this is how I want it to be done, and here's how I want to interact with the group of agents as a human user, the code I've been getting out, mostly function definitions in Python as opposed to really long piece parts, has been really good. So I think there's a lot of value left in what's available to us today that most people are not using, or very few people are using.
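The pattern described here, a manager agent routing work to specialist agents that can call registered tool functions, can be sketched in plain Python. This is a toy illustration of the pattern only, not AutoGen's actual API; the agent names, the "TOOL:" protocol, and the stubbed model functions standing in for local LLMs like Code Llama are all illustrative assumptions.

```python
# Toy sketch of a multi-agent pattern: a manager routes requests to a
# specialist agent, and agents can invoke registered tool functions
# (function calling). The "models" are stubs standing in for local LLMs.

from typing import Callable, Dict


class Agent:
    def __init__(self, name: str, model: Callable[[str], str]):
        self.name = name
        self.model = model            # stand-in for a local LLM call
        self.tools: Dict[str, Callable[[str], str]] = {}

    def register_tool(self, name: str, fn: Callable[[str], str]) -> None:
        """Expose a tool (e.g. web search or scraping) the agent may invoke."""
        self.tools[name] = fn

    def handle(self, request: str) -> str:
        # Toy protocol: if the model replies "TOOL:name:arg", run the tool
        # and feed its result back through the model.
        reply = self.model(request)
        if reply.startswith("TOOL:"):
            _, tool, arg = reply.split(":", 2)
            return self.model(self.tools[tool](arg))
        return reply


class Manager:
    """Director/group-manager agent: routes requests to specialists."""
    def __init__(self):
        self.specialists: Dict[str, Agent] = {}

    def add(self, topic: str, agent: Agent) -> None:
        self.specialists[topic] = agent

    def dispatch(self, topic: str, request: str) -> str:
        return self.specialists[topic].handle(request)


# Stub "model": pretends to be a coding model that uses a search tool.
def coding_model(prompt: str) -> str:
    if prompt.startswith("write"):
        return "TOOL:search:python sort idiom"   # ask for a web search
    return f"def solution():  # based on: {prompt}"

def fake_search(query: str) -> str:
    return f"top result for '{query}'"

coder = Agent("coder", coding_model)
coder.register_tool("search", fake_search)

manager = Manager()
manager.add("code", coder)

print(manager.dispatch("code", "write a sort function"))
```

In a real setup each `model` callable would wrap a different local LLM endpoint, which is what makes mixing a Code Llama coder with a Mistral manager straightforward.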
Well, I'm seeing something interesting, not quite along the same lines as Rich, but on the domain expertise side. I was just on the phone with a company that is doing fleet management for AMRs, autonomous mobile robots that are running around in grocery stores, or in the back of Walmart or Amazon or whatever. They're feeding those not only the domain expertise of ROS, the robot operating system they're using; rather, they're trying to break the walled garden around these fleets of AMRs, and they truly are walled gardens from the OEMs, to integrate them into the back-office systems of manufacturing. And for that you do need domain expertise. So think about an ERP or an MES or a PLM system that would be part of a digital thread or a digital twin. Now you need to start integrating this next level, toward what I would call a cognitive twin, where you're not just getting the visualization, you're actually getting the data. In the pallet that you just picked up, call it Mr. AMR, there could be piece parts for manufacturing that are not carbon neutral or do not meet our compliance for sustainability, and instead of taking it to the production line, we need you to reroute yourself back into the warehouse, to the part where the non-ESG-compliant products go. That level of domain expertise, using agents such as the ones you described, is really interesting, because then you're really taking the notion of a cyber-physical system to the level to which it was intended.
Have you seen anybody using indexing or approaches other than, or in addition to, the standard vector databases, with vector similarity as the standard indexing? Have you seen anybody using graph networks in addition, or hybrids where you have keywords and relational databases integrated with the LLMs, with the agents?
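One concrete way to build the hybrid the question describes, combining a vector-similarity ranking with a keyword ranking, is rank fusion: run each retriever independently and merge the ranked lists. A minimal sketch using reciprocal rank fusion follows; the tiny corpus, the bag-of-words "embedding", and the constant `k = 60` are toy stand-ins and assumptions, not a real vector database.

```python
# Hybrid retrieval sketch: merge a vector-similarity ranking with a
# keyword-overlap ranking via reciprocal rank fusion (RRF).

from collections import Counter
from math import sqrt
from typing import Dict, List

DOCS = {
    "d1": "robot fleet management in the warehouse",
    "d2": "ERP and MES integration for manufacturing",
    "d3": "warehouse robots reroute non-compliant pallets",
}

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' standing in for a real model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def vector_rank(query: str) -> List[str]:
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(DOCS[d])), reverse=True)

def keyword_rank(query: str) -> List[str]:
    terms = set(query.lower().split())
    return sorted(DOCS, key=lambda d: len(terms & set(DOCS[d].lower().split())),
                  reverse=True)

def rrf(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Reciprocal rank fusion: score(d) = sum over lists of 1/(k + rank)."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

query = "warehouse robot fleet"
merged = rrf([vector_rank(query), keyword_rank(query)])
print(merged[0])  # d1
```

A graph retriever would simply contribute a third ranked list to `rrf`, which is what makes fusion a convenient glue for the hybrids mentioned in the question.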
I'm starting to see it. If you take a look at Boston Dynamics' recent announcement with Spot, it's called Orbit, that is the beginning of it, and I think you're going to see much, much more. As a matter of fact, that whole sector of robots and cobots being integrated back toward what would be traditional systems is a hot new area, and that will definitely require the kind of database and indexing that you're talking about. But it's really taking it from the perspective that this is an HMI that is easily understandable, and that's the whole attraction of ChatGPT, the general purpose, and then taking that up a notch with the agents. But again, that requires domain knowledge. And really the question is going to become: how do you capture the knowledge, tacit and implicit, of the human in a way that makes sense? Like, do you create an LLM base or some LangChain pipeline to ask the right questions of the individual about their job, and then start amassing all of that information in some form?
I actually have been talking to people at two different organizations about that very question. My personal opinion is that it's a combination. In point of fact, asking people, interrogating them, trying to elicit knowledge from them, is very much like what folks were trying to do as long ago as 20 years ago with expert systems. But that's historical; it's trying to be predictive, and it can't be. What is useful, and becomes more useful over time, is the notion of using the assistant, the apprentice, the copilot, as a teachable copilot. Part of that teachability is adding to the domain knowledge you're describing: corner cases, new situations that show up. The apprentice looks over the shoulder of the programmer, the designer, the human expert, asking questions when it sees something that it doesn't understand as being conventional, or where it doesn't understand, if you'll pardon the expression, why the human has taken a particular approach. The copilot becomes a very active source of extending domains: teachable agents storing the kind of information that's kept as long-memory context and, at a certain point in time, is found to be useful and therefore incorporated into the training dataset. Those seem to be the ways to do it. And it's a process that keeps going, as opposed to trying to do a brain dump off of the experts.
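The teachable-apprentice loop described here, watch the expert, ask about anything unconventional, bank the explanation as long-term memory that can later seed a training set, can be sketched as a small class. Everything below is an illustrative assumption: the class name, the "conventions" set, and the toy examples are not from any real system.

```python
# Toy sketch of the "teachable apprentice" pattern: the copilot observes
# the expert, asks about actions it does not recognize as conventional,
# and stores the expert's explanations as long-term memory that could
# later be exported as training data.

from typing import Dict, List, Optional


class TeachableApprentice:
    def __init__(self, conventions: set):
        self.conventions = conventions          # what it already understands
        self.memory: List[Dict[str, str]] = []  # long-memory context store

    def observe(self, action: str) -> Optional[str]:
        """Return a question if the expert's action looks unconventional."""
        if action in self.conventions:
            return None
        return f"Why did you choose '{action}'?"

    def learn(self, action: str, explanation: str) -> None:
        """Store the expert's answer; the action is now 'conventional'."""
        self.memory.append({"action": action, "explanation": explanation})
        self.conventions.add(action)

    def training_examples(self) -> List[Dict[str, str]]:
        """Memories found useful can be exported into a training dataset."""
        return list(self.memory)


apprentice = TeachableApprentice({"unit test first", "small commits"})
question = apprentice.observe("hand-rolled retry loop")
print(question)  # asks why, since the action is unfamiliar
apprentice.learn("hand-rolled retry loop", "library retries mask a flaky API")
print(apprentice.observe("hand-rolled retry loop"))  # None: now conventional
```

The key design point matches the conversation: knowledge is accumulated incrementally from observed corner cases, not extracted in one up-front brain dump.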
Yeah, that makes a lot of sense. One of the ideas that I've been kicking around, and it's very much along the lines of what you just described, was creating the copilot to ride on the back of the intern or the apprentice, like the human intern or the human apprentice.
Between the intern and the expert, yes, absolutely.
And then bringing those together into what would be, in my view, the learning management system for domain expertise to add on to any LLM, where the copilot will quickly recognize the lack of experience of the intern or the apprentice, and the tremendous expertise of the expert, and somehow try to find that middle ground, so that the training becomes iterative.
Yes, exactly.
The training of the human being as well as the training of the agent's data, the big data source. And that's one of the things that seems to be quite evident, and this goes back to your force multiplier comments. The level of productivity that companies, enterprises, are reporting, and at this point it's mostly anecdotal, but the productivity that they're seeing with their experts is significant. But the level of productivity, to the degree it's being measured, for their mid-tier employees is significantly greater. In other words, the multiplier, the degree to which you're giving an employee a superpower, depends on their base understanding and base usage, so that they are actually seeing a greater productivity delta from their rank-and-file employees using copilots.
Yeah, I think it's going to be very interesting. As much as I might use the term force multiplier, to me it's much more of a flywheel: you're adding value every time you iterate. And the iteration does actually become more challenging with each new introduction.
Well, this is a good entry point to the last topic that we were going to take on today, and that is: are we really talking about augmentation, as opposed to outright replacement? I literally got into this business because of Doug Engelbart and SRI, seeing the mother of all demos decades ago. Engelbart's whole approach, and this was a point in time when AI was being bandied about but just wasn't very useful, was intelligence augmentation, the whole notion of augmentation as opposed to replacement. And I see a similar kind of distinction between AR and VR.
So I've been having fun re-listening to reviews of the Apple Vision Pro, aka the face computer, still my favorite name, which I think is fascinating in that it is legitimately an AR/VR blend, which they're desperately calling spatial computing. But I didn't get it until somebody in a review was talking about being able to look around while cooking; the cooking one actually was interesting. The interactions are fast enough, even though it's not actually transparent, the cameras in front of your eyes give you feedback fast enough that you don't get the video lag of the feed-through. So this guy was cooking using a Vision Pro, and he could put timers on top of things that needed time, then move away, and the timer stayed where it was, over the dish he was cooking.
Yeah, which is the kind of augmentation that I think is a perfect example. It's simple, and when you first see it, it kind of blows you away, but then it becomes, well, of course, that's exactly the kind of augmentation we'd love to have in our normal day.
Yeah, so it's the beginning of being able to look at something and then have the AI identify what it is and overlay information on top of it. You know, I keep going back to Vernor Vinge's Rainbows End, one of my favorite science fiction books. If you haven't read it, it's a great AR/VR book.
Are you talking about Brin?
Vernor Vinge.
Yeah, Vernor Vinge is the name. Oh, yeah, his series on AR is great.
Yeah, Amazon is not happy with me at the moment. There it is. Okay, here's the book. Nope, that's the wrong book. One of those days. Okay, I'll give you the link from Barnes and Noble instead. But one of the things I loved about that book is it was the first book I read where they were really overlaying the environment enough that they said, yeah, we don't have signs anymore, we don't have streetlights anymore, because people are wearing these augmented reality units that basically show you where the curbs are and all that stuff, so we don't have to put signs out anymore. Here's the correct link.
The book, yeah, Rainbows End. I like and appreciate augmented reality. I do not like, nor do I appreciate, virtual reality. It literally makes me ill, and I think anybody who has any kind of visual impairment feels similarly, unless the glasses themselves are correcting your vision at the same time as they're doing whatever else. I mean, the metaverse is a nice fantasy world, but I don't see us, and maybe we're just not there yet, embracing virtual realities in any productive way, other than the use of the technologies for diagnostics or whatever.
For AR, or for VR?
For VR. AR, I definitely see it.
But I think AR is a much harder computing problem than VR from that perspective. I've had VR headsets for years and years, and frankly, I don't use them very much. I don't have the visual impairments, but it'll still make me nauseous and sick because of the time lag, or I'll get missed frames, stuff where my computer can't keep up, and so I just close my eyes and wait, because otherwise it makes me feel ill. I get motion sickness. And you can't see your hands. One of the things I liked about the Vision Pro is, because of the way they're doing it, even if you are in a virtual space, they have a dial that lets you shift from an AR to a VR space, and if you lift your hands up in that headset, it'll show you your actual hands. They're actually using motions of your hands. Hold on just a second.
Yeah, I don't know about you, Rich, but I just find that virtual reality doesn't do anything for me that I can't do myself. Well, the one thing, on the other hand, that is really good...
Um, I'm absolutely with you. I have not seen anything that is pure VR. That, you know, I guess if I were a bigger gamer, or, you know, something like that, it might be interesting. But
The gaming experience in a VR headset, to me, is preferable to the regular gaming experience. I really, really prefer the immersive aspect of gaming in a VR headset.
All right, I can see that. But other than that, I've yet to see a truly VR situation that was at all appealing, and I put a lot of that down to the fact that the technologies involved just ain't there, or haven't been there. You've made the point that technically everything you're doing with the Vision Pro is VR; you are in fact completely immersed in that video. It turns out that they're taking feeds from where you are locally and playing that through to you, and they are creating an AR environment, the cooking example with the timers. So I absolutely get that. I can see entertainment, immersive film, movies, content; that, I'm sure, is going to be one of the big draws for Vision Pro over the course of the next year, year and a half. The other thing that's going to happen, and apparently they already have it in beta for developers, is direct connections to Macs.
I don't know if it's all USB-C or just Thunderbolt, but you can power the Vision Pro off of that, so you don't have to have a separate battery pack. You literally plug your Vision Pro into a USB-C-style port on your Mac, and you can get two simultaneous full Mac screens without any problem, with about as much real estate on each one of them as you could possibly want. Those are the kinds of things that I think are going to make a big difference for professional use, kind of surprising productivity use.
Right, I mean, it's expensive, but the idea of being able to use it on an airplane seems really interesting to me. There's a part of me that can imagine the airlines, from a safety perspective, being like, we're not okay with you in that immersive world, even if you could actually see what's going on around you. But I expect for frequent travelers, the ability to plug this in and watch stuff, talk to your computer, and actually have some spatial awareness of what's going on on the flight seems...
Stratechery's Ben Thompson and the Daring Fireball guy, John Gruber, discussed that this past week. Exactly. And Thompson, who's on planes all the time, said it was transformative. He said anybody that travels more than once a month on a significant flight, the 3,500 bucks is worth it.
Yeah, I've been considering it from that perspective; it would be absolutely crazy. Or even sitting in on a conference. I mean, some of this strikes me as horrible and great at the same time. If I'm sitting in a conference, I can look at the speaker and talk to the speaker, and I can pull up resource information; the ability to augment that experience seems really, really powerful. It's just, I'm terrified of walking through a conference where everybody's in their own goggles. Although at that point, ideally, if I walked past somebody and I was also wearing the Vision Pro, we could actually interact.
This is not the first time that we've seen a paradigm shift like this. When laptops became powerful and portable and students started using them, it became a big problem for teachers and professors at universities, because suddenly you lost eye contact with your students; most of them ended up staring at the screen while typing. So as a conference presenter, this would be problematic as well. Even with the eyes cast onto the outer screen of the headset, it's not the same.
No, it's not. Where I can see the usage, and I think there's more progress that needs to be made on the spatial computing side, is for product development. You ideate a product and you want to see it in something that is more materials-oriented or fluid, something dynamic, something...
Like a motion simulation of something mechanical, for example.
Well, a motion simulation, but also the material side of it, right? Like, you want to see how much flex there is in a resin, or how much photovoltaic property, or whatever. I can see it being very useful for things like that, and the rapid development of new products. But beyond that, to the earlier point, and to what Rob said earlier, I think it's yet another disintermediation of humanity. And that bothers me.
It's also going to worsen the class divide. Going back to the airplane: sure, if we're flying business class or first class, I can see using that, but in an economy seat, you barely have enough room to pull a book out of your bag. No, that's not going to happen.
Actually, Ben Thompson, who flies to Taipei, Taiwan all the time, said the opposite: I often have to fly economy, where you cannot open a laptop. But I can open the laptop just enough to keep the screen live and plug into it, and I have full access, I can use the keyboard. He said the one thing he found difficult was that when putting Mac screens up on the Vision Pro, the visual identification of where the mouse pointer is tends to jump about, so you don't have the same kind of hand-eye coordination that you have when using the mouse or a trackpad.
That was the comment I saw in one of the reviews: he didn't realize how often his mouse was in advance of his eyes, that he found himself moving his eyes to the next thing before the pointer got there. He didn't realize how much that was a learned behavior.
The other thing I would be concerned about as well is the dystopian potential of this kind of thing. Let's say that the use of these kinds of enhanced-reality goggles becomes widespread. How long until airplanes stop having windows, because you can just plaster the view on the...
I think it's a perfectly rational and perfectly important thing to point out. You're going to start to simulate realities. And this is, among other
things, that book is so good, y'all.
Which is Rainbows End. The other thing, going back to a previous comment: you were talking about interpersonal interactions, and what does it do there? Well, eye contact, obviously, but I can imagine manipulating the nonverbal communications in a human-to-human situation. I mean, if I want to sit there and say, at all times, present me as being attentive, my body position is looking at you, and I'm sending you all of the cues that you usually use to get a sense of whether I'm making a connection or not, when in point of fact, I'm lying on my back snoring.
You could both be doing the same thing. Well, this is, I want to talk about their facial-reading thing, because, you know, I think it's somewhere in the uncanny valley, but this idea that they're projecting a rendering of your face and then reading your expressions. Yeah, they're doing pretty well.
Yes, but the same technologies can then be used to say, all right, present me as smiling, or as surprised.
Or as happy, to match my mood. Or, I'm negotiating with you, and I want you to make me look more stern.
Like, take whatever role you'd like to present with body language. The Chris Voss book comes to mind, when he talks about body language and how important that is.
I imagine we're about to get AI auto-tuning for your tone, so you can do the same thing from a tonal perspective.
Of course, right? When everything you say is going to be auto-tuned.
Exactly right. Think of how important face-to-face communication is going to become from that perspective.
Or how disappointing people are going to find real face-to-face communication after growing up as a kid with this.
It's also going to make political debates a lot more detached from reality than they are now.
I heard an interesting thing on the radio yesterday. They were talking about dating apps, and how people are getting frustrated with dating apps and switching to hybrid approaches. The trend is that people thought dating apps would take over in-person dating, or at least meeting people outside of the app, and what we're seeing is that people are saying, that's not as full an experience, I have to go back to in person. So we're really using these systems for what they're effective and fast at, back to our AI conversation, and then shifting over to high-bandwidth, in-person communication, which is actually preferable from that perspective.
But part of that, and I've read a number of articles with regard to this, comes back to human communication and my view of disintermediation: there is so much distrust of the dating apps per se, because there was so much scamming and so much malfeasance going on in the background, that people have come to realize that unless you're face to face, IRL, you really have no clue who you're dealing with. And to the point of the Vision Pro: would you really put all of your trust and faith in the auto-tuned versions of yourself being represented going forward? I would never negotiate my...
Yeah, the stressors on this stuff, I think, are high. It was interesting to me, stepping back from this conversation: we're all discussing this as a transformative device. They have accomplished something here, and the way we're discussing it indicates that we've already sort of accepted a new reality. We're making an assumption that it's hit whatever milestones were necessary for us to be accepting. Google Glass was cool, but it wasn't material enough. This one, we're actually, and I'm curious, because I'm tempted to buy one. I think it would be cool and interesting; it's just hard for me to justify the money, even letting RackN pay the bill. That's a lot of money. But I would get one.
Go ahead.
Sorry, I didn't mean to cut you off, I just want to make one point. We are all of a certain age and experience. Ask Gen Z what they think of it, and you will have a group that is 100% enamored. Actually, it's now Gen A, the ones younger than Gen Z, who will become completely enamored by it because of its novelty and also its utility. But as you go up through the age groups, I think you're going to find more and more commentary leaning toward what our collective perspective might be, as opposed to the younger group's. We've grown up with a tremendous amount of digital technology, so we see both the pros and the cons from a societal and human-impact point of view, as well as from a utility and productivity point of view. Those that are that much younger than we are may know no other reality, and have only such limited experience before they get into this, that their views will be very skewed in comparison.
I was going to say something along the same lines, particularly that I see a lot of parallels between these devices and tablets when they first came out. When tablets first became popular, I considered one, and I actually purchased one, as a potential desktop replacement. But at that point, I had not considered the implications of losing the haptic feedback of the keyboard, and so I learned my lesson. I think we're going to see something similar here, with Google Glass having been the equivalent of the original tablets, and the limited features on these new devices being closer to, let's say, Chromebooks now. I see Chromebooks as having been the evolution of tablets, companies having learned that you cannot just have an Android OS or an iPad without the physical component to complement it for productivity. And in the case of the Apple device, it's not so much the physical components, it's the environment part of the...
UX. No, it's a new user experience from a computing perspective.
But it's so efficient.
Yeah, to your point, Rob, on acceptance: there were smartphones out before the iPhone. I remember picking up the iPhone the first day and unboxing it and just kind of being in awe, recognizing it was version one and it had failings. But it was still clear to me at that point that while it might not be as functional across the board for everything I needed in a phone and a PDA, you could see where it was going. And this, I think, is the same. To Joanne's point, we have a lot of experience and we have history, we've seen a lot of tech, but we're still talking about VR and AR mostly in terms of past technologies. We're still in the horseless-carriage mindset, as opposed to the generation that has no past history and is going to start coming up with approaches to its use that we haven't even thought to consider.
Well, look at the Rabbit. Look at the popularity of it from release. Why is that new device...
I mean, a lot of people bought it or put in orders, but it hasn't really made it out, and I don't think it's going to make it, to be quite frank.
I'm sorry, I don't know that device.
The R1.
It's like a handheld, AI-specific...
Oh, I've seen things about this. Okay. Yeah.
It has some slick design from Teenage Engineering, it's very inexpensive, and it has a built-in camera so you can do image work with it. I think it's going to be short-lived.
Okay. I think it's a new era of devices, and I think what's going to happen with the Vision Pro is we're going to see specialized versions come out to suit the needs of the consumer community, based on the feedback of the Gen A's, Gen Z's, early millennials...
And much more industrial or enterprise uses. You're going to see a version of the Vision Pro that's for airline mechanics or something like that.
Yeah. And it's just the device version of the domain expertise that's needed to complement the LLM technology. I mean, general purpose is a catchphrase; we need specialized expert systems. So they're giving us the general Vision Pro. Like I said, I'm big on AR, but VR kind of leaves me, kind of like Rob, motion sick.
The other thing to consider, and this goes back to the comment about Gen Z and Gen A, is that unlike us, a generation that has grown up with computers and uses them as professionals, there's a large demographic of Gen Z and Gen A that have never seen the need to have a computer, because they have their smartphone. So that's the market where these devices are going to flourish.
I'm interested in seeing, as I think most of us are here, what is going to come out that's less expensive, focused on AR as opposed to VR, and approachable, in the sense of being modular enough to add technology to it or remove piece parts, but also incorporating some very smart back ends for it. I give it a year, 18 months.
So here's something fascinating, and I need to wrap up today because I've got to jump off the call, but I do want to build on what Rich said, and then we'll wrap. This has all the bells and whistles in it because they don't know what sort of bells or whistles it needs yet. I think it's going to be fascinating to see what the device actually needs to accomplish whatever the killer app is, once the killer app emerges. And they're being pretty transparent in saying they don't know what the killer app is yet; it's a cool device.
I'll tell you one killer app, and then I promise I'll be quiet and let you go: I hear rumors about the streaming services being the killer app for the Vision Pro. The immersive movie, the immersive entertainment, it's going to be all about that, and it's going to be, unfortunately or fortunately depending on how you look at it, the in-vehicle entertainment system.
All right. You're skeptical? Over/under on the first Apple Vision Pro crash lawsuit?
man.
There's been news of people in Teslas using Full Self-Driving while wearing the Apple headsets.
Yeah. Love it. And with that, I'm going to let us wrap up. See you next week. Bye-bye.
Wow, what a fun conversation and a really interesting way to end the discussion of what I think is a remarkable product, one that we're going to be watching pretty carefully as a group. I'm looking forward to hearing from people who have actually used this. If that is you, we would love to have you come join us; it is an open forum, come be part of our discussions. That is how this Cloud 2030 discussion group works. You can find all the details and our schedule at the2030.cloud. I'll see you there. Thank you for listening to the Cloud 2030 podcast. It is sponsored by RackN, where we are really working to build a community of people who are using and thinking about infrastructure differently, because that's what RackN does: we write software that helps put operators back in control of distributed infrastructure, really thinking about how things should be run, and building software that makes that possible. If this is interesting to you, please try out the software; we would love to get your opinion and hear how you think this could transform infrastructure more broadly. Or just keep enjoying the podcast, coming to the discussions, and laying out your thoughts on how you see the future unfolding. It's all part of building a better infrastructure operations community. Thank you.