AI Evolution: Computer Vision, Olfactory Computation, and Neural Nets with Max Olson
Computers with senses? Yeah, that’s a thing.
I brainstorm with my buddy Max about all the new and interesting applications of AI. He has spent his career in computer vision, where AI is going to have some crazy implications. We even talk about how computers can smell (olfactory computation).
Max is a brilliant generalist. He writes, codes, invests, builds hardware, has worked at startups. He knows a lot about a lot.
Today, we wander through the realm of computer vision. He explains what computer vision is, how it has evolved, and what some applications for the technology are. We also explore how Google will fare in the AI transition, and how AI can help boost writers’ and artists’ creativity.
Links to Platforms:
Useful snacks from the episode:
Max is currently working at a startup called Mashgin. Mashgin makes automated self-checkout kiosks. Computer vision recognizes your items, no barcodes or “boops” necessary.
Every company is a “tech” company. Some companies are just shitty at applying technology to their problem.
Computer vision has gone through a major transition. Today, computer vision uses neural nets. But in the past, it was too computationally expensive to train neural nets on this problem.
Max distinguishes the definitions of neural nets, deep learning, and machine learning.
We debated whether Google will be disrupted by AI or not. Huge innovator’s dilemma, and might require a bold business model change.
Max foresees a future where GPT gets good enough to do some of the things a virtual assistant does today.
One day, you may be able to train a cooking robot to iterate until it finds the perfect chocolate cake recipe.
Generative AI may destroy some jobs for writers and artists of more ‘commodity’ work. But it also can be a tool to enhance writers’ and artists’ production.
The sponsor for this week’s episode is Founders Podcast.
David Senra, the host of Founders Podcast, is a biography-reading machine. If you don’t have time to spend 40 hours reading the full-length biography of some Gilded Age entrepreneur, listening to David’s high-quality recap is the next best thing.
It USED TO BE a paid podcast, but recently David switched to ad-based. So search Founders in any podcast player, find the podcast with the white script on a black background, and pick an episode that sounds interesting.
Visit founderspodcast.com to subscribe or listen to sample episodes.
Learn more about Max Olson:
Additional episodes if you enjoyed:
The Most Epic, Personal, In-Depth Balaji Interview Ever (Transhumanism, Investing, and more)
The Next Industrial Revolution w/ J. Storrs Hall: Nuclear Energy, AI and Nanotechnology
Solocast: Metagames, Feedback Loops and Transcending The Muggle World
Episode Transcript:
Max Olson: The example that I use is like imagine the day that an Apple or Google actually productizes that, where right now in your Google Photos or whatever, you're searching like, hey, show me pictures of me and my wife on our honeymoon or in Italy, and it searches those pictures using computer vision. Once you add a feature like this in, then it's like, hey, show me a photo of me and my wife on the summit of Everest, and it will show you a photo of you guys like holding the flag up there, where no human being can tell that it's a fake photo. That, once it's productized, that's the level it will be.
Eric Jorgenson: Hello again and welcome. I'm Eric Jorgenson. And I don't know much, but I have some very smart friends. If you listen to this podcast, so do you. This show explores technology, investing, and entrepreneurship so that you and the rest of humanity have a brighter, more abundant future. This podcast is one of a few projects I work on. To read my book, blog, newsletter, or invest alongside us in these early stage technology companies, please visit ejorgenson.com. Today, my friend Max Olson and I explore AI and computer vision. Max and I go way back. We love to get together and share what we're reading, learning, and imagining about different areas of tech. He's super fun to brainstorm with and learn from. I love these conversations. I'm excited to have you join us. In this episode, we really play with all of the applications and opportunities that Max and I are seeing in AI and computer vision. We start with what computer vision is, how it evolved, some of its applications, and we get into some interesting places like olfactory computation. That means computers can smell better than humans. It’s very interesting. We talk about some scenarios where Google could thrive or die during the AI transition. And we talk about all the potential products out there applying the new image and writing API, AI tools out there. We're starting to see them. We think there's a ton of opportunity left. And I hope you'll pick up some of these balls and run with it. Max is for his part a brilliant generalist. He writes, codes, invests, he builds hardware, he's worked at startups. He knows a lot about a lot. And we have a ton of fun ranging through all these ideas. We don't talk about much today, but Max is also the publisher of the book of Warren Buffett's shareholder letters. He's a fund manager. He always has a ton of interesting side projects. And he's been in an operating role at a startup his whole career. He's got a lot of experience that he brings to some of the insights that he shares, and his blog and his Twitter are exceptional. Highly recommend checking them out in the links in the show notes. If you enjoy this episode, this format, this experience, please let me know in email or Twitter, and I'll have Max back on again. We can talk through another area of tech. We brainstormed also doing space or nuclear, both really interesting kind of emerging places right now that I think would have a really interesting, similar conversation.
I am a super fan of the Founders podcast. It is now my most listened to podcast. It used to be a paid podcast. When they had sponsored before, that was the story. And when David came on my podcast, that was the story. But recently David has switched to an ad-based model. So you can search for Founders in any podcast player, find the podcast with the white script on a black background, and pick an episode that sounds interesting to you. David Senra, the host, is a biography-reading machine. He reads hundreds of entrepreneurs’ biographies from all across history. And this podcast is kind of Dan Carlin-style, him talking through his notes, quotes, and key insights from each book. It's like having a smart, obsessive friend call you and tell you all the interesting things they learned that week from all the books and documentaries that they watched. His superpower is connecting the stories between people. He goes from Arnold Schwarzenegger to Estée Lauder to Charlie Munger. He's an encyclopedia of knowledge. And if you don't have time to spend 40 hours reading the latest skyscraper of a book that Walter Isaacson wrote, listening to David's high-quality recaps in one to two hours is the next best thing. Some of his most popular episodes of all time: Estée Lauder, a story that I did not know but an incredible story of grit and tenacity and talent. Ed Thorp is a great episode, one of my favorites. He is sort of the best role model of all of the hundreds of founders that David has featured, which I think is worth special note. And if you're a fan of my book, David has an episode on The Almanack of Naval Ravikant as well.
Thank you very much for supporting the sponsors who make this show possible. I'm very careful to only pick sponsors I believe in whose products I enjoy and think you will too. If you enjoy conversations like this one you're about to hear and you want to be a part of a community who talks about stuff like this all the time, please go to ejorgenson.com. Share your email. I'll keep you in the loop. There's tons of fun stuff coming. Please enjoy this conversation arriving at your ears in three, two, one.
I’m very excited to do this on the podcast, what we do in normal life, which is call each other and be excited about all the technological things and then like shout and wave our hands about how amazing the world is going to be in the future when we actually figure out how to use and do all this stuff. And then, all the various ways in which one plus one equals three when cool tech stuff comes in. But since this is your first time on the podcast, which is my massive oversight, I feel like a quick intro and your background would be helpful, so people can get a feel for where you've been, what you've seen, what you're into.
Max Olson: Yeah, so I work at a startup called Mashgin. And we make a visual self checkout kiosk where you don't have to scan items’ barcodes one by one, you just place all your items down, you tap your phone to pay, and you go. It's that like super easy checkout, basically. And that's my day job. I have also published a book since 2012. And I run an investment fund on the side.
Eric Jorgenson: That investment fund's been- You've been at that for a while. What is the focus of that?
Max Olson: Yeah, so it's a long only fund, mostly public investments. I mean, that's how it started is just kind of what you would call a hedge fund, a little bit of like options hedging. And over time, I've kind of- private investments have creeped into the strategy. It's mostly kind of your typical long term value investing, that's the majority of the actual position sizing, but then I also do lots of different kinds of special situations, quantitative strategies, things like that. So it's a little bit, it's similar to my own style. It's a little all over the place.
Eric Jorgenson: Yeah, that is like less tech focused, more value focused, though, right? I'm sure there's some overlap. How would you classify that?
Max Olson: Yeah. And when you say tech focus, you mean like in terms of the companies itself? Or?
Eric Jorgenson: Yeah.
Max Olson: I mean, it's definitely not tech focus, per se, although, it’s similar to the private investing. I think it's slowly over time gotten more and more on the more tech kind of side, versus the more what you would think like of Warren Buffett style investing where it's like- definitely not anti tech. I mean, he's invested in multiple tech companies, but it's more of like invest in things that are not going to change in 10 to 20 years, or that's the goal, at least. And yeah, I think my style has gotten a little bit less like that over time. And I've been a little bit more willing to take on, I hate the word risk, but the uncertainty of whether something is going to play out.
Eric Jorgenson: Yeah, specific technology. I like the- I think it's Balaji-ism that like saying there's a tech industry is like saying there's a physics industry. Like, every company interacts with technology, it's just to the extent that they acknowledge it and focus on it.
Max Olson: Oh, yeah. I've gotten in discussions with people in the past where it's just like, hey, that's not a tech company. I'm like, well, not technically how it's defined, yeah, but it's like it's a tech- every company is, in some way, a tech company. I mean, if you operate this old industry, manufacture- a cabinet maker, it's like, okay, you're not a- you wouldn't go around calling yourself a tech company, but you're using technology. And in a lot of ways, you have to keep up with the latest technology and processes and how things are done, how other competitors are doing things that might disrupt your business in some way. So, it's a- if you go into business thinking that every business is a tech business, I would think you'd probably be better off than assuming it's not.
Eric Jorgenson: That'd be an interesting gradient to look at, like what is the most- what is the range of specifically cabinet making companies. Like what is the least technical cabinet making company in the world and what is the most technical, that'd be a good illustrative blog post, probably. You were in computer vision way before it was sexy. So when did you start at Mashgin? Mashgin has been a computer vision company since forever, like you guys were real early to this, which I think is probably a unique perspective on the industry and the tech.
Max Olson: Yeah, Mashgin was first founded by my friend and one-time housemate. And he's been in computer vision, like you said, for literally forever. I mean, he did computer vision at Bell Labs and at Toyota, doing vision for robotics. And when he started Mashgin in 2013, it was, of course, just him at first. And it was what you'd call traditional computer vision. Computer vision today uses these huge neural nets. It's got very similar goals, but it's a totally different method than what they used to use. The best way I can describe it is, as probably most people know, all images can be broken down into numbers. If you have an image that's sized 100 by 100 pixels, that's 10,000 pixels, and each of those pixels is red, green, blue, some combination of that. And so that image is just 30,000 numbers. And so traditional computer vision was more like, let's just say you wanted to determine if something was an apple or an orange, that's an easy example, you're manually programming things, like if-statements that say, hey, look at all of the pixels of this image, and if, on average, this image has more red pixels past some threshold, then it's probably an apple. So that's an example of traditional computer vision, where you might apply some Photoshop-esque filters to it, like turn it black and white or apply some contrast filter to it. But you're basically manually thinking, oh, how would I break this apart and determine what's in this image? Actually, I should have stepped back. If people don't understand what computer vision is, it is what it sounds like; it's helping computers understand what they're looking at, basically. And so, yeah, that's traditional computer vision. And that's the era that Mashgin was originally conceived in. But it was at the very beginning of this huge switchover. And I think, I'm not an expert on the history, but it started right around 2010. Neural nets have been around for a very long time. But when deep learning really started to take off, and we don't need to go into super detail, one of the biggest reasons, as many people have seen, is GPUs and just how powerful they've gotten over time, and using those for compute power to train these networks. Whereas in the past, in the 90s or even the 2000s, it was just too computationally expensive to train these neural nets. But anyway, starting in that 2010-ish timeframe, that's when neural nets really started to take off as a tool.
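To make Max's apple-versus-orange example concrete, here's a minimal sketch of the "traditional," hand-written approach he's describing. It assumes a 100x100 RGB image loaded as a NumPy array; the channel rule and threshold are made up purely for illustration and aren't Mashgin's actual logic.

```python
import numpy as np

def classify_fruit(image: np.ndarray) -> str:
    """Toy 'traditional' computer vision: hand-written rules, no learning.

    `image` is assumed to be a 100x100x3 array of RGB values (0-255),
    i.e. the 30,000 numbers Max mentions.
    """
    red = image[:, :, 0].astype(float)
    green = image[:, :, 1].astype(float)

    # Hand-picked rule: count pixels where red clearly dominates green.
    red_pixels = np.sum(red > green * 1.2)
    red_fraction = red_pixels / red.size

    # Hand-picked threshold -- the "if-statement" Max describes.
    return "apple" if red_fraction > 0.5 else "orange"

# Fake test image: a mostly red blob on a dark background.
img = np.zeros((100, 100, 3), dtype=np.uint8)
img[10:90, 10:90] = [200, 30, 30]
print(classify_fruit(img))  # -> "apple"
```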
Eric Jorgenson: So this approach, like the software approach, had been around, and it's just the hardware caught up to make this software approach more effective?
Max Olson: A little bit of both. A little bit of both. It was on the hardware and software side. So the concept of neural nets had been around for a long time. I don't even remember, like the 60s or 70s is when they were- and they were used throughout then. It was more, honestly, like an academic theoretical thing than a practical application. But I think right around that same time, there were some new ways of doing these on the actual software and algorithm side, not just running them on GPUs, but how the models and algorithms were structured, basically. And we can talk more about that later. But a lot of those changes are going on right now with all of these AI art models and diffusion models and such, where, again, they're still neural nets, but how the neural net is structured and how data is interpreted and things like that are a little bit different. And so that kind of really started to take off during that period. And there was something called ImageNet, which was basically benchmarking how accurate image models were at recognizing images. And I think it's probably changed now, but in the early days, it was specific, like this is a set of images, and we're going to test all of the models on the same images and see how accurate they got. And they even tested humans on that benchmark. And even humans are far from 100%. I don't remember the numbers, but it was like humans might only be 85 to 90% in quick identification of what's in the images. But right around that time is when the models finally kind of beat humans. And similar to some of the other areas where this is happening now, that was a big threshold crossed, and it kind of proved at that time that deep learning was the way to go, basically. And from there, the models and the hardware scaled up, and it's just gotten better ever since.
Eric Jorgenson: I hear neural net, deep learning, and machine learning used in what sounds like interchangeable ways. Is that reasonable? Should we define those differently? Could you kind of frame that out for us?
Max Olson: Yeah, and I won't be the best person in the world to define them. But machine learning is the broader category that contains all of them. Machine learning kind of includes statistical analysis and stuff, too. You could say machine learning is a subset of AI in general. But pretty much everything people do now within AI is considered machine learning. Within machine learning, you have neural nets. And that's just what kind of model you're using to predict what's in the images. Deep learning you could almost use as analogous to neural nets. But a neural net is a stacking of layers of neurons, and a neural net could technically be only two or three layers. You could consider that a neural net, but that's not really deep learning. Deep learning refers more to these bigger models with many layers and thus much more complexity. That's kind of the way I would describe deep learning. And if you want, I can describe a little bit more about how neural nets work and such, but yeah.
Eric Jorgenson: Maybe like with that analogy, the thing that I'm left wondering is what is the analogy to a neuron? Is that like one function, and then you're layering a ton of different functions, and that increases the confidence of the total system's answer?
Max Olson: Kind of, yeah, kind of. So, a neuron is more, let's see, a neuron is more- less like a function and more like a- it is more like just a number. It's like in an equation, your Y=MX+B, a neuron is like a weight. So, in that, in your classic kind of math, y=mx+b, m would be a neuron basically. So, like that would be a weight. And so just to kind of maybe even back up a little bit, and that's kind of what, in some sense, all machine learning is, predictive statistics. People kind of- if you don't know what linear regression is, it is what I just said, it's like the y=mx+b. If you plot two points on a chart and draw a line between them, that's you predicting, like, oh, if anywhere else goes on that line, that's the prediction. That's the basics of linear regression. And that's as simple as you can get, and it just kind of scales up in complexity from there. I mean, for just linear regression, that's just one variable. So, if you're trying to predict the price of a house, and you collect a bunch of examples of how many houses that have sold and how much square feet they have, using square feet, that's just one- that's using one feature. And you can easily just kind of plot the- just do the math by hand on a piece of paper for something that simple, plot the linear regression, and then say like, okay, you have your average- it's average amount per square foot, and therefore, I'm going to predict this house is going to sell for this much. Now, that's probably not that accurate because for the house, you've got all these other factors, number of bedrooms, whether it has a pool or not. And so, if you wanted to, you can just start adding in those features. And that gets into like a multivariate regression, and it starts to get a little bit more complicated, but still enough that you could do it easily if you wanted to in a spreadsheet. And that's essentially, these neural nets are essentially just like scaled, very scaled versions of that. A regression is like one layer of a neural net, essentially, and a neural net is like you continually add layers of that regression. And back to the image example, actually, you have- like I said, an image just represented by all these numbers. And let's just say, one of the most simple, probably, computer vision recognition problems you can think of is like how do you tell if something's a square or a circle? Okay, so if you break those into numbers, what you're essentially doing is like you're feeding in those numbers on one end of the model. And on the other end, you just want it- there's a binary outcome – is this a circle, or is this a square? And so you're just reducing, let's just say like in the example I gave, you’re 100 by 100 pixels, that's, if it's black and white, 10,000 numbers, you're feeding into this function, the model, and it spits out one number at the end of it. So it's essentially just mapping, reducing those 10,000 numbers to one number in these successive layers. And the way that works is like- again, this starts to get complicated, but each layer is like a simplified way to say it's like breaking- it takes all those pixels, and it's breaking it down more and more. So it learns kind of like, okay, this is what a line is, this is what a corner is, this is what a curve is, down to the last few layers where it's like, okay, the second to last layer might, essentially, say like, okay, this object looks like it has four corners, and therefore if it has four corners, then it's a square. And that's a simple example. 
But that scales all the way up to like, is this a dog or a cat? And that last thing is like, is this a dog or a cat? Yeah. And yeah, again, it's just like mapping all of those numbers, reducing it to a few numbers. And that's exactly what humans do, really. If you look at a dog or a cat, it happens instantaneously in your brain. But that's essentially what you're doing. All of these thousands and tens of millions of bits of information that's flowing into your brain instantly, and your brain is reducing it down into like either make a decision or what is this thing?
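As a rough illustration of the "reduce 10,000 numbers down to one number through layers" idea, here's a minimal forward pass in NumPy with made-up layer sizes. The weights here are random rather than trained; training (adjusting those weights) is what the next part of the conversation covers.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 100x100 grayscale image flattened into 10,000 numbers.
image = rng.random((100, 100))
x = image.reshape(-1)                          # shape (10000,)

# Two layers of weights; sizes and scales are arbitrary for illustration.
W1 = rng.normal(scale=0.01, size=(10000, 64))  # layer 1: 10,000 -> 64
W2 = rng.normal(scale=0.1, size=(64, 1))       # layer 2: 64 -> 1

def relu(v):
    return np.maximum(v, 0)

def sigmoid(v):
    return 1 / (1 + np.exp(-v))

hidden = relu(x @ W1)                          # intermediate "features"
score = float(sigmoid(hidden @ W2))            # one number between 0 and 1

# Interpret the single output as square vs. circle (or dog vs. cat).
print("square" if score > 0.5 else "circle", score)
```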
Eric Jorgenson: And we spent 20 years learning what's a dog, what's a cat. Like you treat the machines, the AIs, like kids, and part of it is the complexity of the function or the network, but also part of it is just the sample size. I know a huge part of the boom in AI and computer vision has benefited from just the scale of things that we have added online, made available, made accessible, and fed these things so that they can- I mean, it's safe to say they are now writing their own rules. Like you can tell the function to teach itself how to distinguish a cat from a dog. We don't have to write pointy ears, size, weight, colors; those are things that are kind of emergent from, hey, figure this out.
Max Olson: Yeah, exactly. And that's kind of- and again, back to that's one of the differences between traditional computer vision and now using neural nets is when you're training those models, those huge neural nets, and something like what I was talking about, like square or circle, that wouldn't be- because it's not very complex, that wouldn't- that neural net would not need to be very big. It would only be probably a few layers, only a handful of neurons per layer, and that's good enough. But the bigger the problem, the more the complexity, the more data you need, as you said, and the bigger the models, and yes, it will, in this training process, when it's kind of each of these neurons is- it's like going back and forth like learning basically. It's almost kind of like if you are familiar with the concept of a fitness landscape, where it's like the height on the landscape is how good the solution is. It's like traversing that landscape, trying to find the best weights for that model, so that when it's used, it finds the best solution. So in traversing that, it is, quote, unquote, learning what these things are, like you said. What does a leg look like? What does a- it doesn't know what a leg is per se, but it kind of essentially learns those things over time.
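The "traversing a fitness landscape" idea Max describes is usually implemented as gradient descent: nudge the weights downhill on a loss surface until you land somewhere good. A toy sketch on a one-parameter "landscape" (just a parabola, so the best weight is obvious) looks like this; real training does the same thing over millions of weights at once.

```python
# Toy gradient descent: find the weight w that minimizes a simple loss.

def loss(w):
    return (w - 3.0) ** 2        # "landscape": lowest point at w = 3

def loss_gradient(w):
    return 2 * (w - 3.0)         # slope of the landscape at w

w = 0.0                          # start somewhere arbitrary
learning_rate = 0.1

for step in range(50):
    w -= learning_rate * loss_gradient(w)   # step downhill

print(round(w, 4))               # -> close to 3.0, the bottom of the valley
```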
Eric Jorgenson: Super interesting. And of course, I think I get this from the Bottomless folks that have the company that's like a Wi-Fi enabled scale. And it's just like as a general rule, the more things you can make legible to computers, the more that AI can do for us, and the more that can be done by really high leverage automated like crazy machines. So I feel like the computer vision in general, the companies that I know of in the computer vision space, and you might know more, are much more like- they're very specific. They need to train to a specific data set. Like you're like, we are going to learn all of the items in this store at Mashgin, the items for sale, and we're going to be able to recognize all of them. Self driving is kind of like recognizing cones and people, and objects. That's maybe the most general.
Max Olson: That one's pretty general.
Eric Jorgenson: Yeah, extremely general. Our mutual friends at Density are very like recognize a human in a space. So these are kind of like getting verticalized, but it seems like they're going to get more and more general or cross boundaries more and more maybe.
Max Olson: Yeah, I mean, with these bigger, these much, much bigger models that are coming out, they are much more sophisticated. They can deal with much more complex understandings of what is in the images. And so, with those, I think you're right, that it will be less verticalized. But I still think, in the actual industrial applications of that, whether it's Mashgin for recognizing things for checkout or Density or whatever, you'll still get companies that are in those vertical spaces, but less because of the model and how detailed they need to know the model, and more just because the actual application of it is difficult. And yeah, it's not like- Mashgin is not competing against OpenAI, let's just say. OpenAI might be able to very easily develop a model that can recognize the same types of items that Mashgin can, but that doesn't mean that, if they wanted to enter the self-checkout business, they're going to all of a sudden be very good at it, because it just requires a lot of customer knowledge, user experience knowledge, all that kind of thing. The model is a super important part, but it's not everything. It's not the biggest part even. But yeah, it is a super important key. And there's definitely a lot of data and network effects. Like if you've collected data over time, and you keep collecting it, that's for sure, in the business sense, a barrier to entry. And if you're a startup or whatever, or even if you're one of the bigger companies, it's not as easy just to come in and snap your fingers and drum up the data. With a sufficient amount of resources and money, you probably can, but it's some level of barrier.
Eric Jorgenson: Yeah, especially in the places where the data is protected. Like, it's just expensive to collect self-driving data. So I know people talk about that as a moat that Tesla has, just recording a lot more road hours than others are able to get. Or where the data is- like HIPAA-protected data. Like X-ray screenings, it's difficult to get huge sample sizes. But there are AI models that can train on X-ray scans and be like, oh, that's a dangerous thing, that's this, that's this. People train for years to recognize that kind of stuff. I was starting to think as you were talking, what are the jobs that humans do that are basically taking in visual information and outputting a decision? That's security guards, that's radiologists, that's drivers.
Max Olson: A lot of things in manufacturing. The people there in manufacturing, even if they're using a lot of automation and robots and whatever, the people there, it's a lot of visual, like whether it's monitoring the robots or monitoring the quality of something, something at different steps of the way, it's like, hey, I'm just going to visually inspect this. So there's a lot of that in there. And that was a path that Mashgin once looked into.
Eric Jorgenson: Interesting. Yeah, I bet people have used Mashgin and not even realized it. It's in a ton of stadiums, airports, other commercial sort of like kitchens and cafeterias as well. What's the most common of your installs?
Max Olson: The most common now is convenience stores. So yeah, I should have said too that the machine is not- I say self checkout, and that's what it is, but it's made specifically for small format stores. And so exactly like you said, convenience stores, airports, the kind of sports venues, like food and drink venues at sports stadiums and arenas, cafeterias in schools, businesses, hospitals, things like that, basically, anywhere where you'd go where you don't need a shopping cart, where the extent of the items you're going to get you can carry with your two arms. So that's a good way to say it.
Eric Jorgenson: Okay, all right, cool. Let's sort of examine the transition between maybe how computer vision is feeding into this AI. I feel like AI is having a moment this year, even though I'm sure there's plenty of people who would be saying like, yeah, it's been coming for 10 years, like watch the development of the chess playing things. And five years ago, everyone's talking about how like the ultimate is the centaur, like human plus AI combination. And that's kind of where we are with like the writing stuff right now. Like Nathan Baschez’s new product Lex is kind of like AI outputting art or language is like cool but not sufficient without a human sort of head at the moment. Though, it'll be interesting to explore how that changes. I know both of us have played with kind of the text to image and just like the reverse of the computer vision stuff you've been working on.
Max Olson: Yeah. And that's kind of- I mean, it's more complicated than that. And there's a lot of changes in the models. But that's almost kind of what it is. Like I said, if a neural net understands how to translate the numbers in an image to, is this a dog or a cat, then this is essentially just reversing it and being like, okay, well, what does a dog look like? It's obviously a lot more complicated than just reversing it. And there are these new changes in the algorithms, like this new way that all these text-to-image models use this thing called diffusion. And that's even a little bit over my head in terms of fully understanding exactly what it does. But again, it just kind of goes to show you that there can be innovations in these models. And hence the changes and improvements over time. I think there's hardware improvements as well. These GPUs, particularly NVIDIA's, have just gotten better, more built specifically for deep learning and stuff. So that has improved over time. The scale of the models has drastically gone up. I don't remember the exact numbers, but I think they've done studies about model sizes, and it's basically like the Moore's Law equivalent; it's a lot faster than Moore's Law. The size of the models is just on this crazy curve upward. And yeah, for the most part, to scale the models that big, you need more data. You need equivalent scaling of the data. And so, yeah, it's crazy how fast it's making progress. And it's a little bit of the- you get all of these models coming out at the same time. Like, I'm sure listeners have heard of these text-to-image models that are out there, like DALL-E 2, Midjourney, and Stable Diffusion, the open source one. And all of these have come out at relatively the same time, or at least that's the way it feels. I mean, of course, the researchers have worked on it or similar problems for years. But I think one of the reasons why these all seem to come out at the same time is these changes, these underlying improvements in hardware and such, but it's more than that. I think part of it is the kind of four-minute-mile thing, where once it's done, people know that it can be done and so they work more to do it. The other is that a lot of this stuff, even if the model itself is not open source, like the DALL-E 2 example, they'll come out with a paper or even just a post on roughly how they did it. And their model is based off of papers that people have come out with multiple years ago, potentially. And so if they point to those papers, then the other researchers can kind of figure out basically how they did it, they can try to collect some of that data themselves, and then they build their own model. So it's crazy. And now that you've got the open source thing, people have just taken the Stable Diffusion model and totally run away with it. And we've already seen, in a matter of a couple of months- you've probably seen some of the crazy stuff they're coming out with. And I can't even imagine a few years from now.
Eric Jorgenson: Yeah, I think it's also really interesting seeing the specific ones come out, like the visuals- I've seen friends who- [inaudible 30:33] was talking about this. He's like, all of his blog posts are now illustrated by- he doesn't go searching for stock images anymore. He just drops a prompt into- I don't remember which one he uses, Midjourney maybe, and uses it. He basically hired an AI as his illustrator in the last month. So I think it is interesting to think about the specific jobs that AI is getting hired to do by people right now. I saw, I think it was Ali Abdaal, got access to Lex, Nathan's product. And he used it to write a whole tweet thread of productivity hacks and tweeted it. And it went viral. And then like two days later, he's like, “I have a secret. That was written by an AI.” Like ghostwriting or collaborative writing, illustration.
Max Olson: That’s what Lex is then, it's like a text model kind of applied as an AI assistant, like it's trying to help you write.
Eric Jorgenson: It's basically- so imagine Google Docs. It's like a kind of cleaner, tidier, more design-y form of Google Docs. And you're writing a blog post. And the function that the AI plays is basically if you get stuck, you just type like plus, plus, plus, and tab, and it will populate the next three sentences from GPT-3, I think is the model they use, based on what's already there. And you don't have to use it, it might not be factually correct, but it is like the AI prompt. And it just has a way of sort of like you can agree with it. You can disagree with it. You can edit it. It just like keeps your momentum and juices flowing. As a writer, I think it's really interesting. You can also do the same thing for counter arguments. Instead of like, hey, continue this document, it can be like come up with counterpoints. There's a function where it can create titles based on the body of the text. So, it'll brainstorm like 10 titles for you, which is great. I mean, it's hard to sort of get variety and novelty yourself when you're just brainstorming topics. That's like a high energy task. So, I think it's really cool to just see that augmented. It's the centaur kind of approach to it. And I think it's not quite generate a perfect clean document as AI. But being an input to a writer's process is an important function and a really helpful tool, honestly.
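For a feel of how a tool like Lex can bolt a "continue my draft" or "brainstorm titles" feature onto GPT-3, here's a rough sketch using OpenAI's older completions API (the pre-1.0 `openai` Python package and a GPT-3 model like `text-davinci-002`). Lex's actual prompts and model choices aren't public, so treat the details here as assumptions, not how the product really works.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # assumption: you have an OpenAI API key

def continue_draft(draft: str) -> str:
    """Ask the model for the next few sentences, like hitting +++ then Tab."""
    response = openai.Completion.create(
        model="text-davinci-002",   # assumed GPT-3 model name
        prompt=draft,
        max_tokens=120,
        temperature=0.7,
    )
    return response["choices"][0]["text"]

def brainstorm_titles(draft: str, n: int = 10) -> str:
    """Generate candidate titles from the body text."""
    prompt = f"{draft}\n\nWrite {n} possible titles for the blog post above:\n1."
    response = openai.Completion.create(
        model="text-davinci-002",
        prompt=prompt,
        max_tokens=200,
        temperature=0.9,
    )
    return "1." + response["choices"][0]["text"]

draft = "Computer vision is how we teach machines to understand what they see."
print(continue_draft(draft))
print(brainstorm_titles(draft))
```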
Max Olson: Yeah. And it doesn't- maybe eventually it will potentially be able to spit out this perfect document. But like you said, it doesn't need to. I mean, going back to like just Microsoft Word decades ago, like that had a thesaurus. And it's like it's just that extrapolated. I mean, if you were like, hey, I don't want to use this word here, I want to use like a better sounding word, you look it up in the thesaurus, and you change the word. And this is just like a super advanced version of that. It's like how do I re-word this sentence even, to just like this kind of sounds off or whatever, I mean, find a better way to say this or more succinct way to say what I'm trying to say because it's a little verbose or whatever.
Eric Jorgenson: Yeah, it's funny, the combination of these two is funny, because when you look at a bunch of stuff that Midjourney or DALL-E 2 has generated, the two things that are terrible are, one, faces and, two, text. You could say generate a Snickers bar, and it would look pretty good. But it would say like Skinners, skunkly, like not even words or letters. But as a pure functioning writing device, separately, it is really good. So I think seeing these things come together over time will be really interesting, as we take the best of each thing and combine them.
Max Olson: It could be different models too. A lot of the text models are using a language model like GPT-3 as the basis. And the great thing about GPT is, if you're using their API, you can kind of build off of that. And honestly, I didn't mention it before, but that's one of the great things about neural nets in general, just like other kinds of programming, you can build off of them. GPT was crazy. I think the estimate was it cost them like $12-15 million to train that model on Microsoft’s supercomputer with 10,000 GPUs. But if you want to augment that, you don't have to retrain that model. A simple way to say it is you have this enormous model, and GPT-3 is 175 billion parameters, like weights and neurons, but instead of retraining it, you're just adjusting the few last layers, which requires orders of magnitude less computing power to do. So, that's a huge benefit to all these models. And you're really seeing that happen. That's probably what these companies like Lex and the other writing companies are doing. They use GPT as a base, as this kind of foundation, and they're adding a little bit of kind of levels on top of that to specialize it a little bit in what they're doing. You see a lot of these kind of conversational/assistant models that are probably doing the same. And we haven't seen that much happen with the text-to-image models. But I think you'll start seeing that really quickly. There have already been some papers and some experiments out with Stable Diffusion on doing that. Like if I go into my DALL-E prompt, and I say, show me Eric Jorgenson riding a horse on Mars, it doesn't know who Eric Jorgenson is. But there have already been experiments where I could spin up my own server and run the Stable Diffusion model, and I could add- you don't even need that much, it's crazy- I could add five pictures of Eric Jorgenson and just say, hey, this is Eric. And it learns what you look like. And then when I type that prompt in, it's going to show you, and it's going to be really realistic. And we'll start to see that productized probably really soon.
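Here's what "adjust only the last few layers instead of retraining the whole thing" looks like in practice, sketched with PyTorch and an off-the-shelf ImageNet model rather than GPT or Stable Diffusion (the personalization tricks Max mentions, like teaching a model what "Eric" looks like from five photos, follow the same spirit but use their own techniques). The class count and training data below are placeholders, and the `pretrained=True` flag is the older torchvision signature.

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from a model someone else spent the big compute training.
model = models.resnet18(pretrained=True)

# Freeze every existing weight -- we are NOT retraining the foundation.
for param in model.parameters():
    param.requires_grad = False

# Swap in a new final layer for our own task (say, 50 convenience-store items).
num_classes = 50                      # placeholder
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only the new layer's weights get updated, so training is cheap.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One illustrative training step on fake data (replace with real images/labels).
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))

optimizer.zero_grad()
loss = loss_fn(model(images), labels)
loss.backward()
optimizer.step()
print(float(loss))
```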
Eric Jorgenson: Yeah, I won't say too much about it. I have seen specific startups that are doing some really cool stuff with that early on that they can like motion capture you and then make you do anything or let you do anything, which is going to get weird quick, and I'm going to stop sharing my photo with you and everyone else.
Max Olson: I mean, it does- that's that kind of- I don't like talking too much about the negative side to it because I feel like that gets plenty of air.
Eric Jorgenson: That’s not what we do here.
Max Olson: It's just more fun talking about the positive side. But yeah, there'll be a lot of crazy implications. But people over time, just like anything, they’ll get used to it. I mean, right now, it might be creepy when you're like, oh, my God, someone uploaded pictures of me and they put me in this image, even if it's not nefarious or bad in any way, it's still just kind of weird. But over time, you'll get used to it. And I think- the example that I use is like imagine the day that an Apple or Google actually productizes that, where right now in your Google Photos or whatever you're searching, you’re like, hey, show me pictures of me and my wife on our honeymoon or in Italy, and it searches those pictures using computer vision. Once you add a feature like this in, then it's like, hey, show me a photo of me and my wife like on the summit of Everest, and it will show you a photo of you guys like holding the flag up there, where no human being can tell that it's a fake photo. That, once it's productized, that's the level it will be. And they'll have, of course, all sorts of protections on it. But it'll be like super crazy when that happens.
Eric Jorgenson: Facebook, Google and Apple probably all have the data to do this today. Like they have enough photos that are tagged with enough things that like either they own or have access to. And it's so interesting how often we have come up with the problem and the solution to that at the same time. Like I'm sure I can hear people shouting like blockchain solves this, like the generation of images to a ledger. I don't know. That is like an interesting thing. Like there's ways to combat the nefarious pieces, of course.
Max Olson: Yeah, and just like adversarial models too. I don't know, I would call them adversarial, more of just like models to combat that. And you pass an image through it, and it's just like, is this fake or not? There's some that are obvious, whether it's generated, but there are others where it's like, okay, was this generated or was this just- was this actual art? Was this digital art? Was this a photograph? And it would be able to tell you by like basically passing it through that model. So you'll get that too. And maybe that will become just as commonplace on your iPhone or whatever, where it's like you pull a picture up and it just like now Apple has a little info button underneath it that kind of like does its intelligent machine learning on it. And that will be part of it. It will just be like, hey, this is fake. Or this was an actual photo, or this was a photo and it was edited, things like that.
Eric Jorgenson: Yeah, interesting. I think one of the things that- one of the places that this is going to be most interesting to see like the AI, especially the AI writing playout, is around like Google- Google and AI writing are going to have like this epic showdown because I feel like in some ways we are at peak content generation for SEO purposes. Like there's so- half of Upwork is employed in order to perfectly nail- like generate this content that perfectly nails keyword density and headlines and just all this bullshit, frankly, that's manufactured to achieve a high valuable position in Google Search. And I've seen people share early sort of like their rough models. But here's how GPT-3 answers the same question. And here's how Google answers the question. And it has the same cleanliness that early Google had against the old shitty search engines. And so, it'll be really interesting to see if the advance in like dialogue language and the ability to pull the answer all the way out is actually going to be the thing that finally kind of breaks the monopoly search, the Google Search monopoly. But at the same time, GPT-3 is also super power for people to totally fight the SEO content creation war because it's now trivial to generate as much content to the exact specifics as you want. I don't know, it's going to be really interesting to see what happens.
Max Olson: I've heard different arguments for people thinking that either a startup or whatever is going to come along and disrupt Google, like this is going to supplant Google's model and others saying that, oh no, Google will see this coming. It's not disruptive. It's sustaining to their model. And I'm not so sure, honestly, either way. I’d probably err more on the side if it's disruptive. But I can also see the other argument. It's definitely very different. I mean, I know, I remember within Google itself, they've had to transition over time to more of that kind of using machine learning and these newer models over time because when Google started, they weren't using that. They were using other algorithms and kind of companies get stuck in their way. So I know they have already kind of transitioned to some degree. But that transition between like I'm going to search, and this is- and I'm Google, and I'm going to crawl the web, and I'm going to search this huge, enormous database of what's out there on the internet, versus like I'm going to- I'm here to help you, like answer your question. And it can't be- like GPT-3 right now is great. And you can summarize all sorts of things. You can ask it like, hey, how does a neural network work? Describe this to me like a five year old, and it will explain it to you. And it seems like most of the time, it's pretty damn accurate, accurate enough that as long as you're not writing a paper on it, you could probably trust it. But that's the problem. It's not- even if it's 90% accurate, 90% accurate is not even close to like enough. And then there's a problem that it's not- there's no references. It's like, you're like, well, where's it getting this data? Even if this is accurate, I don't know, where's this from? And so they have- there have already been- like, there was- they did something called Web GPT that almost kind of like combines the large language model with the ability to search Google. So it's like it's supplementing with Google. And there's that. There's just kind of giving the model, essentially, access to the internet. There's a company called- a startup called Adept. And they're, I think, from how I understand it, their hypothesis is like if we develop one of these large foundation models that essentially just uses the internet like a human does but at a massive scale, then that's the road to like general intelligence. Because it's just like that's what a human is. And it's like, if you ask, as opposed to GPT, where it's, again, you're not getting up to date answers. If you say like what's the weather going to be like tomorrow, it has no idea. Whereas Google knows that information. But once you- a GPT level model with that level of understanding gains those abilities by being- by either just using the internet like a human would or being tapped into, connected to all these different API's, that's going to be super powerful. So you can say like, hey, GPT, what's the weather like tomorrow? Where should I go? I'm interested to- I want to go on a road trip tomorrow where it's sunny and is less than three hours driving away, and it just uses whatever website it has to. It doesn't need a database of all of these answers or a model specifically trained to try to find travel related information. It just does what a virtual assistant would do. 
If you asked your assistant to do that, okay, they would go, they'd look up the weather tomorrow, they'd look up travel times, they’d look up some highly rated places, and it would do all of that, instantly give you back the answer in an easy to use response. And there's going to be a lot of that stuff coming. To the sense of, over the last 10 to 15 years, there's so many companies, including Apple and Google, that tried to do the intelligent Siri assistant, and they just have failed. I mean, there's probably plenty of people that use Siri all the time, like, oh, no, it's great. And it is, it's great for certain tasks, but it was not, I think, used as they imagined it which is just like everyone using it all the time as an assistant. It's more used for very certain things. And I think that'll- and so people kind of almost don't even like the idea of like assistant now because it's like that doesn't work. But with these, it's definitely going to work.
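A bare-bones version of the "model that can use tools like a human assistant would" loop Max describes might look like the sketch below. Everything here is hypothetical: `ask_language_model` stands in for a call to a GPT-style API, and the weather and drive-time "tools" are stubs, since the point is just the orchestration pattern (the model decides which tool to call, reads the result, then answers).

```python
from typing import Callable, Dict

# --- Stub "tools" the assistant can call (hypothetical placeholders) ---
def get_weather(city: str) -> str:
    return f"Sunny, 72F in {city} tomorrow"        # pretend API result

def get_drive_time(destination: str) -> str:
    return f"2h 40m to {destination}"               # pretend API result

TOOLS: Dict[str, Callable[[str], str]] = {
    "weather": get_weather,
    "drive_time": get_drive_time,
}

def ask_language_model(prompt: str) -> str:
    """Placeholder for a real GPT-style API call.

    A real model would read the prompt and either reply with
    'TOOL:<name>:<argument>' or a final answer. Here we fake it.
    """
    if "TOOL RESULT" not in prompt:
        return "TOOL:weather:Santa Cruz"
    return "Santa Cruz looks good: sunny tomorrow and under 3 hours away."

def answer(question: str) -> str:
    prompt = f"User question: {question}"
    for _ in range(5):                              # cap the tool-use loop
        reply = ask_language_model(prompt)
        if reply.startswith("TOOL:"):
            _, name, arg = reply.split(":", 2)
            result = TOOLS[name](arg)               # run the requested tool
            prompt += f"\nTOOL RESULT ({name}): {result}"
        else:
            return reply                            # model gave a final answer
    return "Sorry, I couldn't work that out."

print(answer("Where should I road trip tomorrow? Sunny, under 3 hours away."))
```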
Eric Jorgenson: Yeah, it's one of those things that hasn't yet caught on because it's just not quite accurate enough. And it's frustrating- for some reason, the failures are more embarrassing with a voice assistant, to say, find me the nearest restaurant with a fireplace that serves steak that's open right now. That's either a 10 or 15 minute web search, or it's instant given the right amount of information. But Siri couldn't do that right now. I remember being blown away by a Google demo maybe five years ago, when they were showing some of their kind of call-and-make-the-reservation stuff, hooking up to a ton of APIs and having infinite data and enough intelligence to parse that out. I mean, yeah, I don't think that's more than 10 years away.
Max Olson: I would say that's like a couple of years away.
Eric Jorgenson: Yeah, and that's the kind of stuff that I will stop- you don't want to go to Google and get a list of links when that's your intent. You want the answer, and definitely references, and definitely accurate enough. But I think it's an interesting thing. And the spot that Google is in is like the ultimate innovator's dilemma. They should be- they're one of the top AI companies. They've got DeepMind, they have the best technologies in the world. The question isn't, can Google do it? The question is, is Google willing to destroy- eat its own search monopoly, which is like the greatest cash cow the world has ever seen, in order to provide a better user experience? I don't know if they have that DNA. It's been 20 years that that thing has just been invincible. And if they earn a lot- I don't know, maybe they can find a better business model by being the API to the internet, and they can actually collect even more on that. But it's going to be-
Max Olson: And that's the question. There's like is the money- where does the money come from? And so it's like if the money is against it, if they can't- if in trying to think about how they switch to that model, if they can't figure out how to transition SEO or just selling ads in a way that gives them the same or more amount of profit, to your point, that's where there’s going to be the problem. If they can't do that, which based on just kind of thinking about it like you just did, it seems harder to monetize, but maybe not, if they can figure out a way to monetize it in a way that's just as good, that's very similar to just using SEO, where if you asked it like, hey, give me a restaurant within 10 minutes of me with a fireplace that has this meal, where it then kind of figures out how to feed paid results into that in a way that doesn't piss people off too much, then I think it's not disruptive. Then it kind of becomes sustaining to the business model. And they can pull it off. But if they can't, if they're like, hey, like this is- we can't figure out how to really make- we can make some money for it, but not as much, not as efficient as our existing model, which very much could be the case, then it becomes probably really hard to make the switch.
Eric Jorgenson: Yeah, I wonder, if in a world where we are all sort of like browsing with a signed in wallet and constantly undergoing micro payments for different things, like maybe that gets really actually easy because you can unobtrusively monetize sort of the transfer between all of these things. And there's still that kind of like live action auction thing that Google did for ads, but it's for direction of intention, you can attribute all those things, which will be some easier, some harder.
Max Olson: With the new models, if they train it to do that in that process, you'll be able to. I mean, and so, like you're saying, you go to Google now, and you just get a list of links, if you still want that for whatever reason in the new model, I think you could get it. You would just get that- I mean, and Google sometimes even does this now. It's like they have this summarization thing where it's like if I search for some band name or whatever, it's going to give the name of the band, a photo of them, a little like one paragraph description that it has pulled from another site, and then it'll give you the list of the links. So it will essentially just be a much more sophisticated natural version of that. At the bottom, there still might be a list of links of like, hey, if you want a bunch more- if you want to just kind of explore yourself and you don't want the one paragraph answer, then you can still do that.
Eric Jorgenson: Have you recently hired- to return to that, hired an AI for anything sort of in your life? I realized I- this week I literally started- I use Juggernaut AI. I signed up for that, and so an AI is now my personal trainer, like programs all the workouts and sets everything, like updates in between every set and every day and sort of works backwards from this thing, which is like- and does it for less than 25% of the cost of a- And better, frankly, I think, because it can update more often and has a bigger database.
Max Olson: How does it know, do you just plug in like what you've done every day, then?
Eric Jorgenson: Yeah, so it onboards you through a survey. And it's like height, weight, lifting experience, strong points and weak points in each lift, what sort of type of- what type of training and what your goals are, and then it generates a workout. And sort of as you go, it says- it works backwards from rate of perceived exertion. So it says like, do this until you have basically three reps left. Like that's the optimal exertion. And then it'll change weights and rep counts based on what you inputted literally the previous set in your previous workout. And do your warm up and then take a quick- answer three quick questions that are kind of like, how are you feeling? Are your legs sore, or your arms sore? Like that kind of stuff. So it's like micro adjustments within the day and the workout. But macro, I can see it's like training me for a test like six months out, that we have a program that we're working towards, reaching certain goals on that date. So it's really- it seems to have all of the knowledge of a top tier trainer and has some education in there. It doesn't have the feedback loop yet that some of the apps do of like take a video of this lift and watch the barbell path and give you personalized feedback based on that. But you can totally- Like, that's just a feature that exists in a different company that could totally get folded in. It is really, really interesting.
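To show the kind of feedback loop Eric is describing (work backwards from rate of perceived exertion, then nudge the next set), here's a toy auto-regulation rule. This is not Juggernaut AI's actual algorithm, just a simple illustration under made-up assumptions: adjust the load a few percent per point of difference between the RPE you reported and the RPE the program targeted.

```python
def next_set_weight(current_weight: float,
                    reported_rpe: float,
                    target_rpe: float = 8.0,
                    pct_per_rpe_point: float = 0.03) -> float:
    """Toy auto-regulation: if the set felt harder than planned, lighten the
    next one; if it felt easier, add load. 3% per RPE point is a made-up rate.
    """
    adjustment = (target_rpe - reported_rpe) * pct_per_rpe_point
    new_weight = current_weight * (1 + adjustment)
    return round(new_weight / 2.5) * 2.5   # round to the nearest 2.5 lb jump

# Squatted 225 lb, it felt like RPE 9 (about 1 rep left) vs. a target of 8:
print(next_set_weight(225, reported_rpe=9))   # -> 217.5, back off a bit
```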
Max Olson: I mean, add into that the other aspect of having a personal trainer where it's like you have your air pods in or whatever. And it's like maybe not shouting at you, but it's like, hey, one more, one more. Which I think you could easily do with all of those other things as prerequisites.
Eric Jorgenson: Yeah, you can imagine it- so yeah, definitely keep going. Oh, I would totally choose- Like, you get to choose your workout partner. That was another thing we talked about recently is like you get to- AI will emulate heroes who have big datasets that you can like upload them. So I just want to hear Arnold yelling at me while I'm lifting, like I want that copilot experience. That'd be awesome.
Max Olson: Oh, totally. Like you’re weak, Eric. I can't believe you wouldn't do another set. I totally- even knowing that this is fake, yeah. Arnold is literally not telling me to do this. Like, I feel for sure I would be like, God, if he was here, he would be saying that, so, yeah, I should do another set.
Eric Jorgenson: Yeah, I like that a lot. I mean, the same thing could exist- like the combination of computer vision, AI and output, you could have the same thing as like a cooking instructor, like mount a camera over your stove, and it should be able to monitor stuff for doneness. Yeah. And different sensors that you could plug in.
Max Olson: That’s when all these connected kitchen devices will finally really have their moment to shine.
Eric Jorgenson: I mean, the kitchen has been left out of the smart home thing. We just invested in, through Rolling Fun, it's like smart kitchen knobs. So one, it’s a safety thing. So like, if something's been on for more than two hours, you can automatically turn it off if you want that or turn it down after you leave the house or something like that, which is really clever. It's also a cheap way to retrofit an appliance that's expensive and annoying to replace. But it's a path toward the smart kitchen thing in general where you could like coordinate a whole meal, preset it, automate it, like unburn things or prevent things from burning. Burn things, throw them away, and start over without ever knowing and feeling guilty about it.
Max Olson: It just throws it all away- Why are there three pies in the garbage?
Eric Jorgenson: I mean, that's the Jetsons- That's the Jetsons Rosie dream, right?
Max Olson: Yeah, you need whatever the Tesla bot is in your kitchen for that, I guess.
Eric Jorgenson: It's going to look like- but maybe kitchens get redesigned. I've seen some cool stuff where it's like robotic arms on sliders. And it's just like the whole kitchen design is along one wall. And there's like arms at kind of shoulder height that are able to reach up and grab a pan and reach up and pull something out of the fridge that's mounted above it. And then it kind of like reaches down and does stuff on the counter. It's pretty rad. It is different than the like purpose built giant like burger only factory that's kind of like the size of a McDonald's or whatever. But it's pretty cool.
Max Olson: It'd be an interesting one. I mean, I could see that being heavily used in industrial kitchens and whatever, where you might still want a human, a few humans, chefs that know what they're doing, but with the machines massively assisting them. The people who are deathly afraid of AGI, of some sentient AI, they might not like it as much. But yeah, it's cool. They might not be the first early adopters saying, hey, I'm going to bring this robotic arm into my kitchen.
Eric Jorgenson: No, they're going to work at the cabinet factory that just has like a chisel and a hammer. They're nowhere near this. This is another Balaji idea. I'm neck deep in Balaji ideas right now because still finishing the book. But he's got a really interesting observation that like as robotics kind of- like a bunch of the stuff that we're talking about, as robotics takes off, it basically turns- the margins that accrue to software are massive, and it just shrinks the gap between proper instructions and energy. Like everything just becomes energy and output. Like all the creation in the real world is commoditized. Like in the kitchen example, the value becomes how good is the recipe. As long as the AI can execute whatever it's taught, the recipe is the magic because the robot will just do whatever it's told. And then it's just a question of how much- like the robot gets amortized over time and how much is the energy and what are the other inputs? Which I think is just a wild way to think.
Max Olson: But then if you keep going with that, running with that idea, Eric, and just like you said at the beginning of our conversation, the biggest idea with computer vision is that you're trying to give a computer a sense. And a computer can't taste or smell now, but that's not impossible. That's not crazy far off in the future. Honestly, there are machines, industrial kind of lab-scale machines, that do some level of taste and smell. But there's not really any consumer use case or reason to bring it down to that level yet. And the technology is actually really hard. But that space, you'll see, is one of those kinds of things that will all of a sudden, I think, within the next five-plus years, take off and seemingly come out of nowhere, even though there's been research on this. It's called digital olfaction. Like, just as a camera sees, having a computer be able to smell. And I think people- that's something that's like, oh okay, I guess that would be good, maybe not, what would that be used for? But it could be used for just a crazy amount of things, forgetting about even consumer use cases for now. There are startups that are using it to detect COVID in public places so that you don't have to get tested. Just like dogs. Dogs can detect whether or not you have COVID, it's already been proven, just as well as they can detect if you have certain types of cancers. And that's because dogs' noses are insanely more sensitive to molecules in the air than humans'. Does the dog know what cancer is? No, but it can smell that something is different. And some dogs can be trained to be like, hey, this person has COVID, this is what this smells like, this is someone without it. And dogs are great. But it would be nice if we could have that in really cheap devices everywhere. And so, the next time there's a pandemic, you can feed that data in, distribute the model to every one of these devices in the world, and be like, okay. And just like you have security dogs at the airport that TSA runs through to pick out random people and sniff them, you would have devices throughout the airport that just know what these certain things smell like. And there's a lot of things that have smells that people can't imagine, because people's noses are actually really bad.
Eric Jorgenson: Yeah, that's super interesting. The first thing that I thought of was like, oh, cool, it'll be able to sense and smell gas just like we can, natural gas, but also like-
Max Olson: And that's the easier stuff. I mean, we already have the basics of that in smoke and carbon monoxide detectors; the sensors in those are just really low-level, easy versions of what I'm talking about. This is like super advanced smoke detectors, basically, that detect certain molecules in the air. And speaking of which, we've been talking about AI this whole conversation. AI is a heavy component in that, because once you can sense things, just like once you can see things out of a camera, you need, instead of computer vision, computer smell essentially, to be able to run these readings through the models and help the computer understand what it is, quote unquote, smelling.
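A minimal sketch of what "computer smell" could look like in code, under the assumption that a chemical sensor array gives you a vector of readings per sample: you train an ordinary classifier on labeled samples and ship the trained model, not the raw data, to cheap devices. The sensor channels, labels, and numbers here are made up.

```python
# Minimal sketch: classify "smells" from a chemical sensor array.
# The 4-channel readings and labels are invented toy data.
import numpy as np
from sklearn.linear_model import LogisticRegression

readings = np.array([
    [0.9, 0.1, 0.3, 0.2],  # samples from a "positive" source
    [0.8, 0.2, 0.4, 0.1],
    [0.1, 0.7, 0.2, 0.9],  # samples from a "negative" source
    [0.2, 0.8, 0.1, 0.8],
])
labels = np.array([1, 1, 0, 0])

model = LogisticRegression().fit(readings, labels)

# In Max's scenario, this trained model is what gets distributed to
# cheap sensor devices in airports, clinics, and so on.
new_reading = np.array([[0.85, 0.15, 0.35, 0.15]])
print(model.predict(new_reading))  # likely [1]
```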
Eric Jorgenson: Yeah. It's so interesting. I was going down the route of like the computer doesn't have to taste in order to know that it's making something delicious because like we know the combination of ingredients and textures and temperatures with some reasonable certainty that it will come out right without tasting intermittently each individual step. The smell is really interesting though. Yeah, and I had not considered all the things that are like, quote unquote, smelled but sensed in the air.
Max Olson: Smell is so related to taste, though, that would essentially be the equivalent to the point where, like you said, if you had a robot that could bake and, like in your example, that literally could do everything, you could train the robot to be like, hey, make me the perfect cake, chocolate cake. And it could just like- and you give it these huge, I don't know, room size vats of ingredients and just like have fun, like just iterate until you find me the best cake. And it can like just make a thousand cakes until it was like, okay, this is the perfect chocolate cake recipe that I found. So yeah, that's kind of a funny, fun example. But that's the kind of things that it would open up once computers can kind of have all of the senses basically.
Eric Jorgenson: I mean, when they have the senses- I mean, we have some- I wonder what the most automated physical facility in the world is. Like, what is the factory that has the least human sort of interactions on a daily basis? And I don't know that- I bet the answer today is basically extremely- is relatively simple by percentage. But I mean, I think you said this earlier, the vision into the manufacturing and the smelling, like when computers can drive the full feedback loop, when they know when something's wrong, when they know how to fix it, the iteration and the productivity that comes with it, the speed that like a fully automated facility can work at compared to one that relies on human intervention is insane.
Max Olson: It's one of those things that you can't- you can try to sit around and hypothesize and theorize about what the potential use cases would be. But it's just like it starts to get really crazy really fast. And it's one of those things like just you have to wait until it happens and just see what people do with it.
Eric Jorgenson: Digitally, it’d be really easy digitally. I mean, you could probably take GPT-3, cut it loose on Twitter and say like write a tweet every minute and optimize around number of retweets, like take the most retweeted tweets of all time, take those as the inputs, retweet it, like get as big as you can as fast as you can.
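A rough sketch of the loop Eric is imagining, assuming the (legacy) OpenAI completions API of that era; the model name, seed tweets, and the stubbed-out posting step are all illustrative, and actually running this against Twitter would be its own adventure.

```python
# Sketch of "cut GPT-3 loose on Twitter": seed it with high-performing tweets,
# generate a candidate every minute, and post it. The posting step is stubbed.
import time
import openai  # assumes openai.api_key is configured

SEED_TWEETS = "\n".join([
    "Example viral tweet one.",  # in Eric's version: the most-retweeted tweets of all time
    "Example viral tweet two.",
])

def generate_candidate() -> str:
    response = openai.Completion.create(
        model="text-davinci-002",  # illustrative model name
        prompt=f"Tweets that went viral:\n{SEED_TWEETS}\n\nAnother tweet in the same spirit:",
        max_tokens=60,
        temperature=0.9,
    )
    return response.choices[0].text.strip()

def post_tweet(text: str) -> None:
    print("Would post:", text)  # stand-in for a real Twitter API call

while True:
    post_tweet(generate_candidate())
    time.sleep(60)  # "write a tweet every minute"
```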
Max Olson: I mean, there might have been that already happening, and we're not even aware of it.
Eric Jorgenson: That might be Naval. Has anybody seen Naval in person recently?
Max Olson: Does he actually exist? Were you the first to publish a book that was mostly written by AI?
Eric Jorgenson: That'd be really cool if I knew it and slightly embarrassing if I didn't, probably. I don't know where we're going to end up on the social conventions of this. But yeah. I actually think it would be really- like, I bet we will pretty soon see the first book written by AI, or at least in collaboration with AI, and it'd be cool to have a conversation. I've seen conversations that would nominally pass the Turing test. Like, if you're just talking to a chatbot, you would not know that that's not a human. And the curated, book-length version of that might be really interesting. I don't know, it's probably worth playing with.
Max Olson: I mean, people have even- before GPT, people were writing scripts that just go grab clips from Wikipedia articles, haphazardly smash them all together into a Kindle book, spin up like 200 copies of books like this, and try to sell them on Amazon, because maybe one out of 50 of them will sell a bunch of copies, at basically no cost. So yeah, I mean, I can't even imagine what it's going to be like with GPT plugged into that.
Eric Jorgenson: Yeah, it's really- I'm starting to go down the reading rabbit hole of this too. And I'm reading Life 3.0. Have you read this?
Max Olson: No. This is kind of like the- Is this the Kurzweil thing?
Eric Jorgenson: It is Max Tegmark’s book.
Max Olson: Tegmark. Okay, okay, I was thinking of a different one.
Eric Jorgenson: The first chapter or section or whatever is this like narrative. It's a fictional narrative about the team that kind of first cracks AGI, like to some extent, and it definitely- they start over a weekend with putting it on Mechanical Turk and letting it make $500,000 a day performing thousands of Mechanical Turk jobs in parallel that were previously only doable by humans, but are doable by this AI that can train itself. And then the snowball just goes on from there. They start media companies, they're generating articles and blogs and texts and videos, all sort of benevolently optimized for people and to kind of bring them together. It's a very sophisticated, interesting, fictional play-by-play of how it could go- I don't know if it's posted online anywhere outside the book. I'll link it if I can. But it is worth reading that book just for the first chapter of kind of like, oh shit, this is maybe how it could go and what it would be capable of, and I don't know, we see the puzzle pieces coming together.
Max Olson: Yeah, that's when you start to get into a little bit of some of the scary scenarios. But that reminds me of- I think it was earlier this year that a Google, or maybe ex-Google, engineer thought that Google's model was sentient, the LaMDA model. And it's obviously not sentient, even if it's kind of at a Turing level. You can read some of the stuff that GPT or LaMDA or whatever outputs, and it seems like, oh, I cannot distinguish whether this is a human. But there are plenty of people who could distinguish whether it was human. So it's obvious it's not- it's not even quite to the Turing test level, although it's getting there. I've heard people discuss whether the Turing test is a legitimate test in the first place. That's philosophically way over my head. But yeah, I would think that for AGI, you probably have to see something truly unexpected. They're doing these multimodal models now, basically, that are not just text. GPT is only text; it can do other stuff that can be translated to and from text, like code and whatever, pretty easily. But they have these multimodal models where the exact same model can output GPT-like text, and it can do text-to-image, and it can do image recognition. And so that's more along the lines of a generalized intelligence. But yeah, when one of those models one day just does something absolutely, totally unexpected or defies its creators- we haven't had that HAL 9000, "I'm sorry, Dave, I can't do that" moment, where it's like, no, you're programmed to do this, and it's like, no, I don't think so. Maybe that's when you can start thinking, okay, now it's getting to the point of, is it sentient? But that's another much more complicated discussion.
Eric Jorgenson: I mean, even if it's just obeying ordered rules. Yeah, there are a lot of people who are qualified for that debate. I'm not one of them. There are long and, frankly, alternately incredibly interesting and incredibly boring debates between AI alignment experts about all the different problems and scenarios and stuff. I just think it's really going to be fascinating to see how it all comes together. I mean, humans identify ourselves by our intelligence. We became- I don't know, we're the most intelligent species. We tend to define ourselves against things that can outperform us. And now we're building things that are outperforming us, but they're still just tools. When we built something that could dig faster than us, we didn't start to question our humanity. And I'm not sure why intelligence should be that different from the physical piece. It might require a slightly altered definition of humanity, because there are now tools that are more functional than us at thinking tasks, or however you want to classify it. But yeah, that doesn't make it life yet, or maybe ever. But that's probably too philosophical.
Max Olson: We're just here for the tech. We're getting deep.
Eric Jorgenson: So is there stuff that you see from the insider's perspective- you're in the computer vision industry, and now you're seeing all this crazy stuff come out that's generative, which feels like the new piece, not just working within the rules of a game, but openly creative. Is there a huge new field of opportunities that you see now that you're paying attention to that you weren't before? And you're like, oh, there's going to be a rush over there?
Max Olson: Yeah, I mean, generative AI in general- there's been generative AI for a while now. It's just that only recently, with this text-to-art stuff, has it been so good that it's completely come into the public consciousness. And people are like, oh my God, this stuff is crazy. Style transfer has been around for a long time. And many years ago there was the "this person does not exist" stuff, generating human faces from scratch, basically. That's all generative AI. But yeah, it's getting to the point where it's going to be used for so many different things. And even just the art stuff, it went from seemingly not existing a year ago, to most people at least, to the point where even the existing models, as they are, DALL-E 2, Midjourney, whatever, can probably replace jobs at the low end for artists, contract artists who would get hired for like, hey, can you paint me a picture of this or do some illustrations for this. For sure, that's already happening, and that's replacing direct money that people used to make.
Eric Jorgenson: Yeah, the low end writers and illustrators are like in the process of getting deeply disrupted, which is interesting. It's much more leverage for the users of those services, for sure, like 10 to 100x cheaper, and maybe better, maybe just the same, but 100x cheaper of the same thing is a huge win. So yeah, sort of once again, incredible new tool and leverage. Some people who need to find some new work.
Max Olson: And just things that are not direct one-to-one replacements, even just making people more creative. I mean, for sure, this is and will be a tool for artists themselves to make better art, no doubt about that, whether you're making something more complicated, or you're a filmmaker or something like that and you're using it for ideas. Like some of the Midjourney ones- the models seem to me to be better or worse at certain things, they're not all the same. And Midjourney seems to be particularly good at these kind of fantasy-like digital art type illustrations and whatnot. And some of those, I would highly recommend for anyone to just go to Google and search Midjourney showcase. That's their showcase of some of the community's best generations. And some of them are so crazy that it's just so obvious to me that someone who is in a creative field, whether they're making a movie or whatever, can use that to just generate creativity. They're like, okay, I've got a scene I'm doing or whatever, and they can use that. And they don't have to use the actual generated asset. It can just generate something where they go, oh, my God, that gives me a really great idea. Just like the writing, like you mentioned. I mean, even in Lex or Jasper or whatever, even if you don't use the text itself, just the fact that it's there and it's acting as your AI assistant, it gives you this extra creativity boost that you wouldn't have had otherwise, or that would have taken you a lot longer to get. I mean, like you said, is there any that I've used? Well, I think in the last two or three blog posts that I've written- this was not with one of the tools like Lex or whatever, this was just me literally using the GPT-3 API- I'll feed in some two, three paragraphs that I've already written and have it continue those paragraphs. And if I run it through ten times, six or seven of those times, it's going to be- it's not gibberish at all, but it's more just like, oh, that was useless. Like, it doesn't really know what I want to continue. But a few of those times, if it's leading me somewhere interesting, it gets me to think. I'm like, oh, actually, that's a good point, or I forgot to mention that point, or, oh, I like that turn of phrase. I like the way it said that, so I'm going to- not necessarily even copy paste, but I'm going to use that to expand on it. And for the images, it's the same thing. In the last post that I did on SpaceX, all of the images were generated by DALL-E. And first of all, I wanted them all to be in this similar style. I wasn't just plugging in like, hey, show me a picture of a rocket; I wanted them to be all in a similar style. But they're not perfect, at least not yet. And so I usually went in with my iPad after and made some edits to them. And I'm not the best artist or anything. So I can see that exact same process happening with people who are amazing artists, and it acting as a multiplier on their abilities. Just like Photoshop did. This is no different to me than an evolution of Photoshop. When Photoshop first became really popular in the, I don't know, it was like the early 2000s basically, all of a sudden every single photographer was using Photoshop and retouching all their photos. I remember there was controversy.
This was not like something that's like, okay cool, we are just going to accept this. There was controversy, like, hey, this is not real photography.
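For what it's worth, the GPT-3 drafting loop Max describes above is only a few lines against the same completions endpoint: paste in the paragraphs you have, ask for several continuations, and skim them for a useful direction or turn of phrase. Model name and parameters are again illustrative.

```python
# Sketch of the drafting workflow: feed in a few paragraphs and get back
# several continuations to treat as raw material, not finished copy.
import openai  # assumes openai.api_key is configured

draft = "Two or three paragraphs of the post being written go here..."

response = openai.Completion.create(
    model="text-davinci-002",  # illustrative
    prompt=draft,
    max_tokens=200,
    temperature=0.8,
    n=10,  # "if I run it through ten times..."
)

for i, choice in enumerate(response.choices, 1):
    print(f"--- continuation {i} ---")
    print(choice.text.strip())
```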
Eric Jorgenson: Yeah, the film people were upset.
Max Olson: These pictures, and they're modifying them, or they're taking pictures of people like models, and they're making them into these perfect unattainable beings. But eventually now, that's just normal. If you didn't use Photoshop, that'd be really weird. It's fine. Like, that might be something that you can select, you're like, oh, hey, this is an art show for a photographer where it's just the raw images, and they go back and they do it in film. And that's great. But it's like it's normal now to not do that.
Eric Jorgenson: If my wedding photographer didn't use Photoshop, I would have been upset.
Max Olson: Yeah. I mean, it's like who knows? Like maybe 10 years from now, it would be like if my wedding photographer or my whatever doesn't use one of these text- an AI model to put me and my wife in some like awesome fantasy-esque like photos where we’re made of stars, and we're in the universe, and we're like whatever, it's like that could become the norm.
Eric Jorgenson: Or even as simple as like, hey, my uncle Bob is like in the background not paying attention, make him smile. Like that can be as simple as-
Max Olson: I mean, just where you don't need to be a Photoshop expert to do that.
Eric Jorgenson: AI could do that quickly and easily. I'm scrolling through the Midjourney community showcase now. I just Googled it and clicked the top link. They're beautiful pieces of art. And you're kind of like, if you didn't know what you were looking at, you'd be like, wow, these are amazing, talented artists. The fact that they're generated nearly instantly by a computer is crazy, but I also know, read enough about these to know that usually these are the results of a very nuanced, careful approach by the generator artist, for lack of a better term, to say like, I want this and this and this in this style, but with this- there's some engineers- I think they call it prompt engineering, like how do you place the prompt and how- So when you generated those for your blog posts, how much iteration went into getting something that you were happy with and learning the skill of using this medium, basically?
Max Olson: Yeah, and that's definitely a huge aspect of it. People are calling it prompt engineering now, but to me, it's just that you're talking with the AI. Through this prompt engineering, you're figuring out how to converse, basically, with the model and how it works. But yeah, when I generated those images, I had to iterate to get the style that I was looking for. I would get certain ones where I was like, it's not terrible, but I don't like the style or whatever. And figuring out these certain keywords that it has mapped to what I want out of it. And using those keywords, I'm like, okay, if I use these three, but then switch out the subject for whatever it is that I want in the photo, then it gets this. But even then, I might not just press enter and it generates exactly what I want. The way DALL-E does it is that it gives you four different iterations by default. And even amongst those four, it might be like, oh, these are all bad. So I'm going to generate 20 of them, and I'm going to pick the best out of those 20. And you can further refine that. So all of these amazing, amazing images that you see on the Midjourney showcase, those are probably a result of that. It's not like someone typed in- And the great thing about the Midjourney ones is that it gives you all the prompts that people typed in. So if you hover over the images, it'll show the prompts. And for all of these, my guess is someone didn't just type that prompt in and press enter and this was the result they got. There was iteration, probably a lot of iteration on the prompts. A lot of them are more complicated prompts where they're adding all these other things in to try to manipulate it and get what they want out of the image. Some of them are really simple, though. Sometimes people are just typing in, give me a picture of the creation of the universe, something that's kind of ambiguous, open to total interpretation, and it just lets the model run with it. And then you can run that prompt 100 times maybe. And out of those 100, a lot of them are going to be crap, where you're just like, whatever, a five-year-old could have done that. But some of them might be truly amazing. And so that's the thing. Again, for people who haven't looked at this, each time you're generating an image, that's the first time you've ever seen that image. It's not going into a database and pulling out a bank of existing images or pieces of other photos or illustrations. It is generating a new thing. The seed is new every single time. And so that's what's really truly unique and crazy about them. And again, it's using some of the- that's part of the big controversy that we don't have to go too much into in this talk. But it's using images and art and photographs and whatnot from other people. And there's definitely, obviously, controversy around that. And it's like, should those people be paid somehow? And it gets complicated because, look, I will always remember the phrase "everything is a remix." Everything is based off of something else that's been done. If you're a writer even, you're obviously using the English language or whatever language you're writing in.
But the way you write is based on bits and pieces of writing that you yourself have picked up over your lifetime reading various things. If you only read sci-fi novels and you go out and write something, you're going to pick up pieces of the style of writing that you're used to reading. And it's similar with images. So if everything is a remix even with people, and I'm inspired by Van Gogh's style of artwork and fold it into mine, does that mean I'm copying Van Gogh? No, of course not. It gets a little bit murky with these models because it literally is using those works as an input in the training process. But the art that it outputs is not from them. It just learns their style, just like humans do.
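The image side of that workflow has the same shape: one prompt, a batch of candidates, pick a keeper or tweak the style keywords and go again. This assumes the DALL-E image API that was in beta around this time; the prompt itself is just an example.

```python
# Sketch of the DALL-E iteration loop: generate a batch per prompt,
# review the results, refine the prompt, repeat.
import openai  # assumes openai.api_key is configured

prompt = "a rocket on the launch pad at dawn, flat vector illustration, muted palette"

response = openai.Image.create(
    prompt=prompt,
    n=4,  # DALL-E returns four variations by default
    size="1024x1024",
)

for i, image in enumerate(response["data"], 1):
    print(i, image["url"])  # eyeball the candidates, then adjust the prompt and rerun
```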
Eric Jorgenson: Yeah, that was a part of the onboarding to Lex. Nathan was kind of- it seems like more clear of an issue in writing. I thought about it earlier when you said you really like the turn of phrase sometimes, like even a clip of language that the AI pulls, and maybe they pulled it from somebody else. The onboarding warns, it's like there's the potential for some light plagiarism, depending on what your use of this article is. Like, that may or may not be a problem for you. But there's really no way to know because it's pulling from so many sources. Sometimes it's generating its own thing, but it might be the same and you don't really know. It's pretty wild. But there's also like fuckers on Twitter who just go around taking my viral tweets, copy pasting them word for word and posting them to their own. So, it's not like this is a unique issue to humans, or to machines. If anything, it's at least an unbiased version. But you mentioned Jasper. I followed the guy who started Copy AI for a while. I think Jasper is similar. I haven't checked that out. But they're basically- I think Copy AI generates copy for you based on GPT-3 for like marketing and landing pages. Yeah, you pay $49 a month, and you get 40,000 words generated. And they are- I saw him just share like last week, they're 10 million ARR two years into their thing. And I think it's just like a really friendly, easy way to kind of- maybe they have some tweaks on top of it or rules or grammar checks or sort of a service layer, I don't know. But I mean, these technologies are turning into like real, real businesses. And I think cool stuff is getting built on top of these APIs. And we're going to see a lot more stuff like this in the next year or two.
Max Olson: GPT as a model has been out for- it was released more than- a little over two years ago, that was just the initial model. And then they probably have released it as an API out more and more, and people have been experimenting with it more and more. So, it takes a little bit of time to kind of seep in. I think these image ones are- it seems like they're happening a little bit faster than normal. But yeah, it's going to, in the next five years, even with the existing models, you're going to see all sorts of crazy use cases. And again, this like these things take a little bit of time to be distributed and to seep into kind of the everyday things that people do. But you're already seeing it, like you said, generate a lot of revenue. And I can see some of these things even just scaling up faster. I mean, especially because, in this case, the backbone they're using is GPT. And they're already able to just like scale to infinity, basically. And so, if they're just using that as the backend, then it's way easier- it's way easier to scale. And maybe, the argument is that that kind of, because they're using that as the backend, they're a little bit less defensible as a business model. But maybe there's other layers of kind of- a competitive advantage that can be layered on top of that. I think that's just probably totally dependent on the use case. I haven't given a lot of thought for each of these individual use cases, but kind of what I was alluding to from Mashgin, it's like these are a little bit easier because it's just software alone. When you get start to get industrial use cases, like Mashgin or Density, then you have a hardware component, and you're selling to these huge enterprises sometimes. And it's like there's that layer that's a pretty- acts as a strong competitive moat. But so it's a little bit easier for software, but I can see just it's totally dependent on the use case. But yeah, we're going to see it on a crazy level from this kind of whether it's what we were talking about earlier, the Google replacement to whatever, just assistant for x. I've heard people already talk about the legal assistants is like the obvious one. But whatever you're doing online, you can just like- this is the super- this is the new version of the Microsoft Clippy in the corner that's like- but like the real one. It's time for Clippy to come back. And yeah, I really wish Microsoft would release like a GPT version of Clippy that's just like, “Hi, I'm intelligent now. I can actually help you.” Yeah, we're going to be seeing that.
Eric Jorgenson: I'm back, and I hate all of you for mocking me on the internet for 20 years.
Max Olson: Hey, maybe that's why Microsoft invested in OpenAI and is running all their training for them.
Eric Jorgenson: Yeah. Clippy is back. Yeah. Well, if you're building a company that is based on any of these things, especially if you have hardware moats or data moats or distribution moats, or service layer moats, that's a thing that we might be interested in investing in, so please holler. We intended to also talk about nuclear, but we're 90 minutes in and that was all AI so maybe we'll do another episode on nuclear. So, any final recapping thoughts on the AI brainstorm session that we've had today?
Max Olson: I think that what I was just talking about is a good final word, which is that we're going to be seeing a crazy amount of changes in the upcoming years. I mean, if you follow this even just loosely enough, it seems that something new comes out almost every week, honestly. And again, these things that are coming out are not quite productized yet. The average person isn't going to go use them for the most part, but that's going to happen really fast, because this is not like 10 years ago where people were talking about self-driving cars, like, oh God, there's all these self-driving startups starting and, imminently, we're going to have self-driving cars. That's just such a crazier problem, for another discussion. But these things are already to the point where the models themselves are basically ready to be productized and used. Like you said, there's all these startups that are already using GPT as a backend. And same with DALL-E. DALL-E has its API in beta right now. And that's going to be crazy. You go on the Nike website and say, hey, give me this shoe or whatever, and on the fly it generates this crazy, creatively designed shoe for you. I mean, you will see it everywhere.
Eric Jorgenson: Yeah, there's a ton of opportunity with this over the short term, I think it's fair to say, like productizing them, applying them, helping businesses apply them, solving problems, to create artifacts that need to be done. There's a million ways to kind of apply these tools, like we're talking about. And I mean, what's also exciting over the medium term is what gets done with these tools. This is the classic like technologies compound in terms of impact and progress. And this is a huge lever that we're going to be able to add, and I think it's not going to be long before AI is like helping us. I think it's already helping us discover drugs, helping us diagnose diseases, helping us analyze datasets that we couldn't possibly get to before.
Max Olson: I mean, that's like we talked for so long about it, but we didn't even mention any of those other more hard tech real world applications of these models. We were talking about more of the fun ones. But yeah, that's like- the advancements in those, in the last little while from DeepMind and others are just like crazy to think about. There was one recently that kind of- that made matrix multiplication faster, which basically, that's how you train these models. Basically, it's matrix multiplication. So it's essentially kind of like the model is making it faster and more efficient to train other models. And so, you can see where that that loop is going. But yeah, it's crazy to think about all of the potential. I mean, there's- you said we were going to talk about nuclear, and we're not able to, but there's a model that helps control- there's many companies now that are working on nuclear fusion, and the fusion plasma needs to be controlled. And that's like an insanely physically hard thing to do. And DeepMind developed models that help control that in real time and adjust the electromagnets to control the fusion plasma. And there you go. It's like that kind of stuff, it almost seems like this is- it seems like neural nets and different types and structures of neural nets are almost this- they're turning into this universal problem solving method, where the important part is like if you can phrase it in a way, you can phrase the problem in a way that's amenable to be plugged into a model like this and plug data in, then it will, back to the fitness landscape thing, if you can format it in that way, then it will try to find the highest peak, and it's really, really good at solving that. And so, I just tweeted about this the other day where it's like one of the explanations or reasoning that's given behind the great stagnation, our technical slowdown since the 70s is ideas are just harder to get. We picked the low hanging fruit before that. And now, it's just a lot harder. It takes a lot more researchers, a lot more time to discover something new and truly groundbreaking. But I'm not as extremely bought into that theory versus other things like energy and regulation or whatnot. But regardless, maybe it turns out that these models are really good at helping with that, at solving these problems and finding and acting as that tool to find new breakthrough ideas faster than if we could as just a bunch of humans in a lab or whatever. And that would be pretty amazing if that just kind of really started to run away. And it's like okay, first it's chess, then it's go, then it's protein folding, which was pretty huge, and then it's matrix multiplication. And then- yeah, I mean, the problem is just like- the problem for humans to do is like if you can, again, format that problem in a way that can be fed into the model, then it can usually help solve the problem.
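To ground the matrix multiplication point: a neural net's forward pass is, at its core, a stack of matrix multiplies, which is why a faster multiplication algorithm (the DeepMind result Max mentions, presumably the AlphaTensor work) feeds directly back into how fast you can train and run models. A bare-bones example:

```python
# A single dense layer's forward pass is a matrix multiply plus a bias;
# speed up matrix multiplication and you speed up training and inference.
import numpy as np

batch = np.random.rand(32, 128)    # 32 examples, 128 input features
weights = np.random.rand(128, 64)  # layer mapping 128 features -> 64
bias = np.zeros(64)

activations = np.maximum(batch @ weights + bias, 0)  # ReLU(x @ W + b)
print(activations.shape)  # (32, 64)
```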
Eric Jorgenson: And these are like stackable skills. Like, you can imagine a model designed to generate prompts for another model that is designed to maximize like variety of guesses and attempts and brainstorms. And you can imagine another model that is designed to fact check the output of that against all of the physical laws and all of the design constraints and all of the material costs of different designs. And you could come up with some really- and all three of those, by the way, are iterating at trivial expense massively, quickly, in parallel, as compute power and energy costs decrease is a really, really insane sort of thing that may lead to breakthroughs and breakthroughs and breakthroughs in these other massively enabling technologies, like nuclear and nanotech and space, which are all things that you will be back to talk about more in the future, which I'm very excited to do. But yeah, we should-
Max Olson: We got to have a space episode.
Eric Jorgenson: Yeah, well, maybe we'll do space next. I know you've done a ton of like research and interesting- have a ton of interest on that. But yeah, we could not let this get away without mentioning that AI is like a max- a crazy variable, a coefficient to the rest of the innovation supply chain and can have some really, really huge impacts over the medium term on every other area that affects abundance and costs of living and all of the things that we love.
Max Olson: It's going to be a huge industry too, if it isn't already. I mean, it's more of a question of just how fast it's going to happen. To me, it's like, are we going to have another boom in the next few years, where some of these AI companies' valuations go insane, and their revenue is going crazy, and all these resources are starting to flow to them? That's going to happen. It's just a matter of how quickly that happens. I mean, you can already start to see it with some of these upcoming startups. And so yeah, it'll be fun to watch.
Eric Jorgenson: Yeah, very excited to have an oar in the water on that front and see and get to use some of these tools. Well, thanks for coming, Max. Thanks for taking the time. I appreciate this and enjoyed it, as I always do. 100% go follow Max on Twitter, @MaxOlson, and get his newsletter and stuff. And you'll be ahead of me and involved in the conversation because he's always sharing these really interesting things that he finds. And I don't know, we love batting these things back and forth. So thanks for hanging out.
Max Olson: Yeah. Nice to talk with you. And yeah, I hope to do it again soon.
Eric Jorgenson: Absolutely. See you in space.
If you enjoyed this episode, you will definitely love my episode with J. Storrs Hall, number 34. It's one of my favorite interviews I've ever done. He is the author of the book Where Is My Flying Car? I did an episode that is just about that book and my notes, David Senra style. That's episode 32. Both of those were a huge inspiration to me to get into these AI, nuclear, nanotech, far-future things. Josh, in his book Where Is My Flying Car?, does an amazing job showing the scientific progress and opportunities that still await us to create a next industrial revolution and 100x the material world that we experience. It could be a very exciting future if we get our shit together. You might also enjoy episode 45. Mark Nelson talks a lot about the nuclear piece in particular, the regulation, the history of the industry, where the opportunities are, where some of the roadblocks are. You may also enjoy the Rolling Fun series where we talk about our startup investing fund and the companies we're investing in. It's got a very similar energy to this episode, kind of bouncy brainstorming fun. And my most popular episode of all time is the four-hour episode I did with Balaji; I highly encourage you to check that out. As always, thank you so much for taking the time. If you love what we're doing, please take four seconds to leave a review in the podcast app. It is the best way to help the show grow. And I deeply appreciate it, as I appreciate your company. Thank you for listening. See you next time.